Abstract

Minkowski Space is the simplest four-dimensional Lorentzian Mani- 
fold, being topologically trivial and globally flat, and hence the sim- 
plest model of spacetime — from a General-Relativistic point of view. 
But this does not mean that it is altogether structurally trivial. In 
fact, it has a very rich structure, parts of which will be spelled out in 
detail in this contribution, which is written for Minkowski Spacetime: 
A Hundred Years Later, edited by Vesselin Petkov, to appear in 2008 
in the Springer Series on Fundamental Theories of Physics, Springer 
Verlag, Berlin. 
1 General Introduction



There are many routes to Minkowski space. But the most physical one still 
seems to me via the law of inertia. And even along these lines alternative 
approaches exist. Many papers were published in physics and mathemat- 
ics journals over the last 100 years in which incremental progress was re- 
ported as regards the minimal set of hypotheses from which the structure 
of Minkowski space could be deduced. One could imagine a Hasse-diagram-like
picture in which all these contributions (being the nodes) together with
their logical dependencies (being the directed links) were depicted. It would 
look surprisingly complex. 

From a General-Relativistic point of view, Minkowski space just models 
an empty spacetime, that is, a spacetime devoid of any material content. It 
is worth keeping in mind that this was not Minkowski's view. Close to the
beginning of Raum und Zeit he stated:^1

In order to not leave a yawning void, we wish to imagine that 
at every place and at every time something perceivable exists. 

This already touches upon a critical point. Our modern theoretical view of 
spacetime is much inspired by the typical hierarchical thinking of mathemat- 
ics of the late 19th and first half of the 20th century, in which the set comes 
first, and then we add various structures on it. We first think of spacetime 
as a set and then structure it according to various physical inputs. But what 
are the elements of this set? Recall how Georg Cantor, in his first article on
transfinite set-theory, defined a set:^2

By a 'set' we understand any gathering-together M of determined
well-distinguished objects m of our intuition or of our
thinking (which are called the 'elements' of M) into a whole.

Do we think of spacetime points as "determined well-distinguished objects 
of our intuition or of our thinking"? I think Minkowski felt a need to do 
so, as his statement quoted above indicates, and also saw the problematic 
side of it: If we mentally individuate the points (elements) of spacetime, 
we — as physicists — have no other means to do so than to fill up spacetime 
with actual matter, hoping that this could be done in such a diluted fash- 
ion that this matter will not dynamically affect the processes that we are 
going to describe. In other words: The whole concept of a rigid background
spacetime is, from its very beginning, based on an assumption of (at best)
approximate validity. It is important to realise that this does not necessarily
refer to General Relativity: Even if the need to incorporate gravity by a
variable and matter-dependent spacetime geometry did not exist, the
concept of a rigid background spacetime would still be of approximate nature,
provided we think of spacetime points as individuated by actual physical events.

^1 German original: "Um nirgends eine gähnende Leere zu lassen, wollen wir uns vorstellen,
daß allerorten und zu jeder Zeit etwas Wahrnehmbares vorhanden ist". (p. 2)

^2 German original: "Unter einer 'Menge' verstehen wir jede Zusammenfassung M von
bestimmten wohlunterschiedenen Objecten m unserer Anschauung oder unseres Denkens
(welche die 'Elemente' von M genannt werden) zu einem Ganzen." ([15], p. 481)

It is true that modern set theory regards Cantor's original definition 
as too naive, and that for good reasons. It allows too many "gatherings- 
together" with self-contradictory properties, as exemplified by the infamous 
antinomies of classical set theory. Also, modern set theory deliberately 
stands back from any characterisation of elements in order to not confuse 
the axioms themselves with their possible interpretations.^3 However,
applications to physics require interpreted axioms, where it remains true that
elements of sets are thought of as definite as in Cantor's original definition.

Modern textbooks on Special Relativity have little to say about this, 
though an increasing unease seems to raise its voice from certain directions 
in the philosophy-of-science community; see, e.g., [11][10]. Physicists sometimes
tend to address points of spacetime as potential events, but that always
seemed to me like poetry,^4 begging the question how a mere potentiality is
actually used for individuation. To me the right attitude seems to admit 
that the operational justification of the notion of spacetime events is only 
approximately possible, but nevertheless allow it as primitive element of the- 
orising. The only thing to keep in mind is to not take mathematical rigour 
for ultimate physical validity. The purpose of mathematical rigour is rather 
to establish the tightest possible bonds between basic assumptions (axioms) 
and decidable consequences. Only then can we — in principle — learn any- 
thing through falsification. 

The last remark opens another general issue, which is implicit in much 
of theoretical research, namely how to balance between attempted rigour 
in drawing consequences and attempted closeness to reality when formulating
one's starting platform (at the expense of rigour when drawing consequences).
As the mathematical physicists Glance & Wightman once formulated
it in a different context (that of superselection rules in Quantum
Mechanics): 

The theoretical results currently available fall into two categories:
rigorous results on approximate models and approximate
results in realistic models. ([48], p. 204)

^3 This urge for a clean distinction between the axioms and their possible interpretations
is contained in the famous and amusing dictum, attributed to David Hilbert by his
student Otto Blumenthal: "One must always be able to say 'tables', 'chairs', and 'beer
mugs' instead of 'points', 'lines', and 'planes'". (German original: "Man muß jederzeit an
Stelle von 'Punkten', 'Geraden' und 'Ebenen' 'Tische', 'Stühle' und 'Bierseidel' sagen
können.")

^4 "And as imagination bodies forth The forms of things unknown, the poet's pen Turns
them to shapes, and gives to airy nothing A local habitation and a name." (A Midsummer
Night's Dream, Theseus at V,i)

To me this seems to be the generic situation in theoretical physics. In that
respect, Minkowski space is certainly an approximate model, but to a very 
good approximation indeed: as global model of spacetime if gravity plays 
no dynamical role, and as local model of spacetime in far more general situ- 
ations. This justifies looking at some of its rich mathematical structures in 
detail. Some mathematical background material is provided in the Appen- 
dices. 

2 Minkowski space and its partial automorphisms 

2.1 Outline of general strategy 

Consider first the general situation where one is given a set S. Without
any further structure being specified, the automorphism group of S would
be the group of bijections of S, i.e. maps $f : S \to S$ which are injective
(into) and surjective (onto). It is called Perm(S), where 'Perm' stands for
'permutations'. Now endow S with some structure A; for example, it could
be an equivalence relation on S, that is, a partition of S into an exhaustive
set of mutually disjoint subsets (cf. Sect. A.1). The automorphism group of
(S, A) is then the subgroup $\mathrm{Perm}(S \mid A) \subseteq \mathrm{Perm}(S)$ that preserves A. Note
that $\mathrm{Perm}(S \mid A)$ contains only those maps f preserving A whose inverses
$f^{-1}$ also preserve A. Now consider another structure, A', and form
$\mathrm{Perm}(S \mid A')$. One way in which the two structures A and A' may be compared is
to compare their automorphism groups $\mathrm{Perm}(S \mid A)$ and $\mathrm{Perm}(S \mid A')$.
Comparing the latter means, in particular, to see whether one is contained
in the other. Containedness clearly defines a partial order relation on the
set of subgroups of Perm(S), which we can use to define a partial order on
the set of structures. One structure, A, is said to be strictly stronger than
(or equally strong as) another structure, A', in symbols $A \geq A'$, iff^5 the
automorphism group of the former is properly contained in (or is equal to)
the automorphism group of the latter.^6 In symbols:
$A \geq A' \Leftrightarrow \mathrm{Perm}(S \mid A) \subseteq \mathrm{Perm}(S \mid A')$.
Note that in this way of speaking a substructure (i.e.
one being defined by a subset of conditions, relations, objects, etc.) of a
given structure is said to be weaker than the latter. This way of thinking
of structures in terms of their automorphism group is adopted from Felix
Klein's Erlanger Programm [34], in which this strategy is used in an attempt
to classify and compare geometries.
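To make the ordering of structures by their automorphism groups concrete, here is a minimal sketch for a four-element set. The set S, the three partitions, and the helper names `preserves` and `perm_group` are hypothetical choices made only for this illustration; for a finite set the requirement that the inverse also preserves the structure is automatic.

```python
from itertools import permutations

def preserves(image, elements, blocks):
    """True if the bijection elements[i] -> image[i] maps every block onto a block."""
    f = dict(zip(elements, image))
    block_set = {frozenset(b) for b in blocks}
    return all(frozenset(f[x] for x in b) in block_set for b in block_set)

def perm_group(elements, blocks):
    """Perm(S | A): all bijections of S that preserve the partition `blocks`."""
    return {img for img in permutations(elements)
            if preserves(img, elements, blocks)}

S = (0, 1, 2, 3)
A_coarse = [{0, 1}, {2, 3}]      # an equivalence relation with two classes
A_fine = [{0}, {1}, {2, 3}]      # refines A_coarse
A_other = [{0}, {1, 2, 3}]       # a different partition

G_coarse = perm_group(S, A_coarse)
G_fine = perm_group(S, A_fine)
G_other = perm_group(S, A_other)

print(len(G_coarse), len(G_fine), len(G_other))      # 8 4 6
print(G_fine < G_coarse)                             # True: A_fine is strictly stronger
print(G_other <= G_coarse, G_coarse <= G_other)      # False False: incomparable
```

The last line illustrates that the order is only partial: neither of the two automorphism groups is contained in the other.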

This general procedure can be applied to Minkowski space, endowed with
its usual structure (see below). We can then ask whether the automorphism
group of Minkowski space, which we know is the inhomogeneous Lorentz
group ILor, also called the Poincaré group, is already the automorphism
group of a proper substructure. If this were the case we would say that the
original structure is redundant. It would then be of interest to try and find
a minimal set of structures that already imply the Poincaré group. This
can be done by trial and error: one starts with some more or less obvious
substructure, determines its automorphism group, and compares it to the
Poincaré group. Generically it will turn out larger, i.e. to properly contain
ILor. The obvious questions to ask then are: how much larger? and: what
would be a minimal extra condition that eliminates the difference?

^5 Throughout we use 'iff' as abbreviation for 'if and only if'.

^6 Strictly speaking, it would be more appropriate to speak of conjugacy classes of
subgroups in Perm(S) here.

2.2 Definition of Minkowski space and Poincare group 

These questions have been asked in connection with various substructures 
of Minkowski space, whose definition is as follows: 

Definition 1. Minkowski space of $n \geq 2$ dimensions, denoted by $\mathbb{M}^n$,
is a real n-dimensional affine space, whose associated real n-dimensional
vector space V is endowed with a non-degenerate symmetric bilinear form
$g : V \times V \to \mathbb{R}$ of signature $(1, n-1)$ (i.e. there exists a basis
$\{e_0, e_1, \dots, e_{n-1}\}$ of V such that $g(e_a, e_b) = \mathrm{diag}(1, -1, \dots, -1)$).
$\mathbb{M}^n$ is also endowed with the standard differentiable structure of $\mathbb{R}^n$.

We refer to Appendix A.2 for the definition of affine spaces. Note also
that the last statement concerning differentiable structures is put in in view
of the strange fact that just for the physically most interesting case, n = 4,
there exist many inequivalent differentiable structures of $\mathbb{R}^4$. Finally
we stress that, at this point, we did not endow Minkowski space with an
orientation or time orientation.

Definition 2. The Poincaré group in $n \geq 2$ dimensions, which is the
same as the inhomogeneous Lorentz group in $n \geq 2$ dimensions and
therefore will be denoted by $\mathrm{ILor}^n$, is that subgroup of the general affine
group of real n-dimensional affine space for which the uniquely associated
linear maps $f : V \to V$ are elements of the Lorentz group $\mathrm{Lor}^n$, that is,
preserve g in the sense that $g(f(v), f(w)) = g(v, w)$ for all $v, w \in V$.

See Appendix A.3 for the definition of affine maps and the general affine
group. Again we stress that since we did not endow Minkowski space with
any orientation, the Poincaré group as defined here would not respect any
such structure.

As explained in A.4, any choice of an affine frame allows us to identify
the general affine group in n dimensions with the semi-direct product
$\mathbb{R}^n \rtimes \mathrm{GL}(n)$. That identification clearly depends on the choice of the frame. If we
restrict the bases to those where $g(e_a, e_b) = \mathrm{diag}(1, -1, \dots, -1)$, then $\mathrm{ILor}^n$
can be identified with $\mathbb{R}^n \rtimes \mathrm{O}(1, n-1)$.

We can further endow Minkowski space with an orientation and, independently,
a time orientation. An orientation of an affine space is equivalent
to an orientation of its associated vector space V. A time orientation is also
defined through a time orientation of V, which is explained below. The
subgroup of the Poincaré group preserving the overall orientation is denoted by
$\mathrm{ILor}_+$ (proper Poincaré group), the one preserving time orientation by $\mathrm{ILor}^\uparrow$
(orthochronous Poincaré group), and $\mathrm{ILor}_+^\uparrow$ denotes the subgroup preserving
both (proper orthochronous Poincaré group).

Upon the choice of a basis we may identify $\mathrm{ILor}_+$ with $\mathbb{R}^n \rtimes \mathrm{SO}(1, n-1)$
and $\mathrm{ILor}_+^\uparrow$ with $\mathbb{R}^n \rtimes \mathrm{SO}_0(1, n-1)$, where $\mathrm{SO}_0(1, n-1)$ is the component
of the identity of $\mathrm{SO}(1, n-1)$.

Let us add a few more comments about the elementary geometry of
Minkowski space. We introduce the following notations:

$$ v \cdot w := g(v, w) \quad\text{and}\quad \|v\|_g := \sqrt{|g(v, v)|} . \tag{1} $$

We shall also simply write $v^2$ for $v \cdot v$. A vector $v \in V$ is called timelike,
lightlike, or spacelike according to $v^2$ being > 0, = 0, or < 0 respectively.
Non-spacelike vectors are also called causal and their set, $\bar{\mathcal{C}} \subset V$, is called
the causal-doublecone. Its interior, $\mathcal{C}$, is called the chronological-doublecone
and its boundary, $\mathcal{L}$, the light-doublecone:

$$ \bar{\mathcal{C}} := \{v \in V \mid v^2 \geq 0\} , \tag{2a} $$
$$ \mathcal{C} := \{v \in V \mid v^2 > 0\} , \tag{2b} $$
$$ \mathcal{L} := \{v \in V \mid v^2 = 0\} . \tag{2c} $$



A linear subspace $V' \subseteq V$ is called timelike, lightlike, or spacelike
according to $g|_{V'}$ being indefinite, negative semi-definite but not negative
definite, or negative definite respectively. Instead of the usual
Cauchy-Schwarz inequality we have

$$ v^2 w^2 \leq (v \cdot w)^2 \quad \text{for } \mathrm{span}\{v, w\} \text{ timelike} , \tag{3a} $$
$$ v^2 w^2 = (v \cdot w)^2 \quad \text{for } \mathrm{span}\{v, w\} \text{ lightlike} , \tag{3b} $$
$$ v^2 w^2 \geq (v \cdot w)^2 \quad \text{for } \mathrm{span}\{v, w\} \text{ spacelike} . \tag{3c} $$

Given a set $W \subseteq V$ (not necessarily a subspace^7), its g-orthogonal
complement is the subspace

$$ W^\perp := \{v \in V \mid v \cdot w = 0, \ \forall w \in W\} . \tag{4} $$

If $v \in V$ is lightlike then $v \in v^\perp$. In fact, $v^\perp$ is the unique lightlike
hyperplane (cf. Sect. A.2) containing v. In this case the hyperplane is called
degenerate because the restriction of g to $v^\perp$ is degenerate. On the other
hand, if v is timelike/spacelike, $v^\perp$ is spacelike/timelike and $v \notin v^\perp$. Now
the hyperplane $v^\perp$ is called non-degenerate because the restriction of g to
$v^\perp$ is non-degenerate.

^7 By a 'subspace' of a vector space we always understand a sub vector-space.

Given any subset $W \subseteq V$, we can attach it to a point p in $\mathbb{M}^n$:

$$ W_p := p + W := \{p + w \mid w \in W\} . \tag{5} $$

In particular, the causal-, chronological-, and light-doublecones at $p \in \mathbb{M}^n$
are given by:

$$ \bar{\mathcal{C}}_p := p + \bar{\mathcal{C}} , \tag{6a} $$
$$ \mathcal{C}_p := p + \mathcal{C} , \tag{6b} $$
$$ \mathcal{L}_p := p + \mathcal{L} . \tag{6c} $$

If W is a subspace of V then $W_p$ is an affine subspace of $\mathbb{M}^n$ over W. If W
is time-, light-, or spacelike then $W_p$ is also called time-, light-, or spacelike.
Of particular interest are the hyperplanes $v^\perp_p$, which are timelike, lightlike,
or spacelike according to v being spacelike, lightlike, or timelike respectively.

Two points $p, q \in \mathbb{M}^n$ are said to be timelike-, lightlike-, or spacelike
separated if the line joining them (equivalently: the vector p - q) is timelike,
lightlike, or spacelike respectively. Non-spacelike separated points are also
called causally separated and the line through them is called a causal line.

It is easy to show that the relation $v \sim w \Leftrightarrow v \cdot w > 0$ defines an
equivalence relation (cf. Sect. A.1) on the set of timelike vectors. (Only
transitivity is non-trivial, i.e. if $u \cdot v > 0$ and $v \cdot w > 0$ then $u \cdot w > 0$.
To show this, decompose u and w into their components parallel and
perpendicular to v.) Each of the two equivalence classes is a cone in V, that
is, a subset closed under addition and multiplication with positive numbers.
Vectors in the same class are said to have the same time orientation. In
the same fashion, the relation $v \sim w \Leftrightarrow v \cdot w \geq 0$ defines an equivalence
relation on the set of causal vectors, with both equivalence classes being
again cones. The existence of these equivalence relations is expressed by
saying that $\mathbb{M}^n$ is time orientable. Picking one of the two possible time
orientations is then equivalent to specifying a single timelike reference vector,
$v_*$, whose equivalence class of directions may be called the future. This being
done we can speak of the future (or forward, indicated by a superscript +)
and past (or backward, indicated by a superscript -) cones:

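$$ \bar{\mathcal{C}}^\pm := \{v \in \bar{\mathcal{C}} \mid v \cdot v_* \gtrless 0\} , \tag{7a} $$
$$ \mathcal{C}^\pm := \{v \in \mathcal{C} \mid v \cdot v_* \gtrless 0\} , \tag{7b} $$
$$ \mathcal{L}^\pm := \{v \in \mathcal{L} \mid v \cdot v_* \gtrless 0\} . \tag{7c} $$

Note that $\bar{\mathcal{C}}^\pm = \mathcal{C}^\pm \cup \mathcal{L}^\pm$ and $\mathcal{C}^\pm \cap \mathcal{L}^\pm = \emptyset$. Usually $\mathcal{L}^+$ is called the future
and $\mathcal{L}^-$ the past lightcone. Mathematically speaking this is an abuse of
language since, in contrast to $\bar{\mathcal{C}}^\pm$ and $\mathcal{C}^\pm$, they are not cones: They are
each invariant (as sets) under multiplication with positive real numbers, but
adding two vectors in $\mathcal{L}^\pm$ will result in a vector in $\mathcal{C}^\pm$ unless the vectors were
parallel.

As before, these cones can be attached to the points in $\mathbb{M}^n$. We write in
a straightforward manner:

$$ \bar{\mathcal{C}}^\pm_p := p + \bar{\mathcal{C}}^\pm , \tag{8a} $$
$$ \mathcal{C}^\pm_p := p + \mathcal{C}^\pm , \tag{8b} $$
$$ \mathcal{L}^\pm_p := p + \mathcal{L}^\pm . \tag{8c} $$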
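The transitivity argument indicated earlier for the relation ~ on timelike vectors (decomposing u and w parallel and perpendicular to v) can be spelled out as follows. This is only a sketch: the normalisation $v^2 = 1$, which costs no generality since rescaling v by a positive number changes no sign, and the symbols $a$, $b$, $u_\perp$, $w_\perp$ are introduced here for illustration. The last step uses the ordinary Cauchy-Schwarz inequality on the spacelike hyperplane $v^\perp$, where $-g$ is positive definite.

```latex
\begin{align*}
u &= a\,v + u_\perp, & a &= u \cdot v > 0, & u_\perp &\in v^\perp,\\
w &= b\,v + w_\perp, & b &= w \cdot v > 0, & w_\perp &\in v^\perp,\\
u^2 &= a^2 + u_\perp^2 > 0 &&\Rightarrow & a^2 &> |u_\perp^2|,\\
w^2 &= b^2 + w_\perp^2 > 0 &&\Rightarrow & b^2 &> |w_\perp^2|,\\
u \cdot w &= ab + u_\perp \cdot w_\perp \;\geq\; ab - \sqrt{|u_\perp^2|\,|w_\perp^2|} \;>\; 0 .
\end{align*}
```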

The Cauchy-Schwarz inequalities (3) result in various generalised
triangle-inequalities. Clearly, for spacelike vectors, one just has the
ordinary triangle inequality. But for causal or timelike vectors one has to
distinguish the cases according to the relative time orientations. For example, for
timelike vectors of equal time orientation, one obtains the reversed triangle
inequality:

$$ \|v + w\|_g \geq \|v\|_g + \|w\|_g , \tag{9} $$

with equality iff v and w are parallel. It expresses the geometry behind the
'twin paradox'.

Sometimes a Minkowski 'distance function' $d : \mathbb{M}^n \times \mathbb{M}^n \to \mathbb{R}$ is
introduced through

$$ d(p, q) := \|p - q\|_g . \tag{10} $$

Clearly this is not a distance function in the ordinary sense, since it is neither
true that $d(p, q) = 0 \Leftrightarrow p = q$ nor that $d(p, w) + d(w, q) \geq d(p, q)$ for all
p, q, w.
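The reversed inequalities and the failure of d to be a distance function can be checked numerically. The following is a minimal sketch in four dimensions with signature (1, 3); the sample vectors and the helper names `dot`, `norm_g`, `add`, `sub`, and `d` are hypothetical choices made for this illustration.

```python
import math

def dot(v, w):
    """Minkowski product of signature (1, n-1): v0*w0 - sum_i vi*wi."""
    return v[0] * w[0] - sum(a * b for a, b in zip(v[1:], w[1:]))

def norm_g(v):
    """||v||_g = sqrt(|g(v, v)|), as in (1)."""
    return math.sqrt(abs(dot(v, v)))

def add(v, w):
    return tuple(a + b for a, b in zip(v, w))

def sub(v, w):
    return tuple(a - b for a, b in zip(v, w))

def d(p, q):
    """Minkowski 'distance' of (10)."""
    return norm_g(sub(p, q))

# Two timelike vectors of equal time orientation (v.w > 0).
v = (2.0, 1.0, 0.0, 0.0)
w = (3.0, 0.0, 2.0, 0.0)

reversed_cs = dot(v, v) * dot(w, w) <= dot(v, w) ** 2          # (3a)
reversed_triangle = norm_g(add(v, w)) >= norm_g(v) + norm_g(w)  # (9)
print(reversed_cs, reversed_triangle)                           # True True

# d vanishes for distinct lightlike separated points ...
p, q = (0.0, 0.0, 0.0, 0.0), (1.0, 1.0, 0.0, 0.0)
print(d(p, q))                                                  # 0.0

# ... and the ordinary triangle inequality fails along a broken light ray.
a, b, c = (0.0, 0.0, 0.0, 0.0), (1.0, 1.0, 0.0, 0.0), (2.0, 0.0, 0.0, 0.0)
print(d(a, b) + d(b, c) >= d(a, c))                             # False
```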



2.3 From metric to affine structures

In this section we consider general isometries of Minkowski space. By this
we mean general bijections $F : \mathbb{M}^n \to \mathbb{M}^n$ (no requirement like continuity or
even linearity is made) which preserve the Minkowski distance (10) as well
as the time or spacelike character; hence

$$ (F(p) - F(q))^2 = (p - q)^2 \quad \text{for all } p, q \in \mathbb{M}^n . \tag{11} $$

Poincaré transformations form a special class of such isometries, namely
those which are affine. Are there non-affine isometries? One might expect
a whole Pandora's box full of wild (discontinuous) ones. But, fortunately,
they do not exist: Any map $f : V \to V$ satisfying $(f(v))^2 = v^2$ for all v must
be linear. As a warm up, we show

Theorem 1. Let $f : V \to V$ be a surjection (no further conditions) so that
$f(v) \cdot f(w) = v \cdot w$ for all $v, w \in V$, then f is linear.

Proof. Consider $I := (a f(u) + b f(v) - f(au + bv)) \cdot w$. Surjectivity allows to
write $w = f(z)$, so that $I = a\, u \cdot z + b\, v \cdot z - (au + bv) \cdot z$, which vanishes for
all $z \in V$. Hence $I = 0$ for all $w \in V$, which by non-degeneracy of g implies
the linearity of f. □

This shows in particular that any bijection $F : \mathbb{M}^n \to \mathbb{M}^n$ of Minkowski
space whose associated map $f : V \to V$, defined by $f(v) := F(o + v) - F(o)$
for some chosen basepoint o, preserves the Minkowski metric must be a
Poincaré transformation. As already indicated, this result can be
considerably strengthened. But before going into this, we mention a special and
important class of linear isometries of (V, g), namely reflections at
non-degenerate hyperplanes. The reflection at $v^\perp$ is defined by

$$ \rho_v(x) := x - 2 v \, \frac{v \cdot x}{v^2} . \tag{12} $$

Their significance is due to the following

Theorem 2 (Cartan, Dieudonné). Let the dimension of V be n. Any isometry
of (V, g) is the composition of at most n reflections.

Proof. Comprehensive proofs may be found in jST] or [5]. The easier proof 
for at most $2n - 1$ reflections is as follows: Let $\phi$ be a linear isometry
and $v \in V$ so that $v^2 \neq 0$ (which certainly exists). Let $w = \phi(v)$, then
$(v + w)^2 + (v - w)^2 = 4v^2 \neq 0$ so that $w + v$ and $w - v$ cannot simultaneously
have zero squares. So let $(v \mp w)^2 \neq 0$ (understood as alternatives), then
$\rho_{v \mp w}(v) = \pm w$ and $\rho_{v \mp w}(w) = \pm v$. Hence v is eigenvector with eigenvalue
1 of the linear isometry given by

$$ \phi' = \begin{cases} \rho_{v-w} \circ \phi & \text{if } (v - w)^2 \neq 0 , \\ \rho_v \circ \rho_{v+w} \circ \phi & \text{if } (v - w)^2 = 0 . \end{cases} \tag{13} $$

Consider now the linear isometry $\phi'|_{v^\perp}$ on $v^\perp$ with induced bilinear form
$g|_{v^\perp}$, which is non-degenerate due to $v^2 \neq 0$. We conclude by induction:
At each dimension we need at most two reflections to reduce the problem
by one dimension. After $n - 1$ steps we have reduced the problem to one
dimension, where we need at most one more reflection. Hence we need at
most $2(n - 1) + 1 = 2n - 1$ reflections which upon composition with $\phi$ produce
the identity. Here we use that any linear isometry in $v^\perp$ can be canonically
extended to $\mathrm{span}\{v\} \oplus v^\perp$ by just letting it act trivially on $\mathrm{span}\{v\}$. □

Note that this proof does not make use of the signature of g. In fact, the 
theorem is true for any signatures; it only depends on g being symmetric 
and non degenerate. 
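The properties of the reflections (12) that enter the proof can be verified with exact rational arithmetic. The following is a minimal sketch; the signature (1, 3), the sample vectors, and the helper names `dot`, `sub`, and `reflect` are assumptions made for this illustration. The vector w merely stands in for the image $\phi(v)$ of some isometry, which only requires $w^2 = v^2$.

```python
from fractions import Fraction

def dot(v, w):
    """Minkowski product of signature (1, n-1)."""
    return v[0] * w[0] - sum(a * b for a, b in zip(v[1:], w[1:]))

def sub(v, w):
    return tuple(a - b for a, b in zip(v, w))

def reflect(v, x):
    """rho_v(x) = x - 2 v (v.x)/(v.v), the reflection at v-perp as in (12)."""
    v2 = dot(v, v)
    if v2 == 0:
        raise ValueError("the hyperplane v-perp must be non-degenerate")
    c = Fraction(2 * dot(v, x), v2)
    return tuple(xi - c * vi for xi, vi in zip(x, v))

v = (2, 1, 0, 0)          # v.v = 3
w = (2, 0, 1, 0)          # w.w = 3, stands in for phi(v)
x = (3, 1, 2, 0)
y = (0, 1, -1, 2)

# rho_v is a linear isometry, an involution, and sends v to -v.
print(dot(reflect(v, x), reflect(v, y)) == dot(x, y))   # True
print(reflect(v, reflect(v, x)) == x)                   # True
print(reflect(v, v) == (-2, -1, 0, 0))                  # True

# Key step of the proof: since (v - w)^2 != 0, rho_{v-w} swaps v and w.
n = sub(v, w)
print(dot(n, n) != 0, reflect(n, v) == w, reflect(n, w) == v)   # True True True
```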
2.4 From causal to affine structures

As already mentioned, Theorem 1 can be improved upon, in the sense that
the hypothesis for the map being an isometry is replaced by the hypothesis
that it merely preserves some relation that derives from the metric structure,
but is not equivalent to it. In fact, there are various such relations which
we first have to introduce.

The family of cones $\{\bar{\mathcal{C}}^+_q \mid q \in \mathbb{M}^n\}$ defines a partial-order relation
(cf. Sect. A.1), denoted by $\geq$, on spacetime as follows: $p \geq q$ iff $p \in \bar{\mathcal{C}}^+_q$, i.e.
iff p - q is causal and future pointing. Similarly, the family $\{\mathcal{C}^+_q \mid q \in \mathbb{M}^n\}$
defines a strict partial order, denoted by $>$, as follows: $p > q$ iff $p \in \mathcal{C}^+_q$, i.e.
iff p - q is timelike and future pointing. There is a third relation, called $\gtrdot$,
defined as follows: $p \gtrdot q$ iff $p \in \mathcal{L}^+_q$, i.e. p is on the future lightcone at q. It
is not a partial order due to the lack of transitivity, which, in turn, is due
to the lack of the lightcone being a cone (in the proper mathematical sense
explained above). Replacing the future (+) with the past (-) cones gives
the relations $\leq$, $<$, and $\lessdot$.

It is obvious that the action of $\mathrm{ILor}^\uparrow$ (spatial reflections are permitted)
on $\mathbb{M}^n$ maps each of the six families of cones (8) into itself and therefore
leaves each of the six relations invariant. For example: Let $p > q$ and
$F \in \mathrm{ILor}^\uparrow$, then $(p - q)^2 > 0$ and p - q future pointing, but also
$(F(p) - F(q))^2 > 0$ and F(p) - F(q) future pointing, hence $F(p) > F(q)$. Another set
of 'obvious' transformations of $\mathbb{M}^n$ leaving these relations invariant is given
by all dilations:

$$ d_{(\lambda, m)} : \mathbb{M}^n \to \mathbb{M}^n , \quad p \mapsto d_{(\lambda, m)}(p) := \lambda (p - m) + m , \tag{14} $$

where $\lambda \in \mathbb{R}_+$ is the constant dilation-factor and $m \in \mathbb{M}^n$ the centre. This
follows from $(d_{(\lambda, m)}(p) - d_{(\lambda, m)}(q))^2 = \lambda^2 (p - q)^2$,
$(d_{(\lambda, m)}(p) - d_{(\lambda, m)}(q)) \cdot v_* = \lambda (p - q) \cdot v_*$, and the positivity of $\lambda$.
Since translations are already contained in $\mathrm{ILor}^\uparrow$, the group generated by
$\mathrm{ILor}^\uparrow$ and all $d_{(\lambda, m)}$ is the same as the group
generated by $\mathrm{ILor}^\uparrow$ and all $d_{(\lambda, m)}$ for fixed m.
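That orthochronous Lorentz transformations and dilations leave the chronological relation invariant can be illustrated with exact arithmetic. The sketch below assumes four dimensions, the reference vector $v_* = (1, 0, 0, 0)$, and a particular boost with cosh = 5/3 and sinh = 4/3; all function names and sample points are hypothetical choices for this illustration.

```python
from fractions import Fraction as Fr

V_STAR = (1, 0, 0, 0)   # timelike reference vector fixing the time orientation

def dot(v, w):
    """Minkowski product of signature (1, 3)."""
    return v[0] * w[0] - sum(a * b for a, b in zip(v[1:], w[1:]))

def sub(v, w):
    return tuple(a - b for a, b in zip(v, w))

def chron_after(p, q):
    """p > q: the vector p - q is timelike and future pointing."""
    dv = sub(p, q)
    return dot(dv, dv) > 0 and dot(dv, V_STAR) > 0

def boost_x(p):
    """Orthochronous boost along x with cosh = 5/3, sinh = 4/3."""
    c, s = Fr(5, 3), Fr(4, 3)
    t, x, y, z = p
    return (c * t + s * x, s * t + c * x, y, z)

def dilation(lam, m, p):
    """d_(lambda, m)(p) = lambda (p - m) + m, as in (14)."""
    return tuple(lam * (pi - mi) + mi for pi, mi in zip(p, m))

p, q, r = (5, 1, 2, 0), (0, 0, 0, 0), (1, 3, 0, 0)
lam, m = Fr(7, 2), (1, 1, 1, 1)

# The boost preserves squares; the dilation rescales them by lambda^2.
print(dot(sub(boost_x(p), boost_x(q)), sub(boost_x(p), boost_x(q))))   # 20
dp, dq = dilation(lam, m, p), dilation(lam, m, q)
print(dot(sub(dp, dq), sub(dp, dq)) == lam ** 2 * 20)                  # True

# The relation p > q survives both maps; the spacelike pair (r, q) stays unrelated.
print(chron_after(p, q), chron_after(boost_x(p), boost_x(q)), chron_after(dp, dq))
print(chron_after(r, q), chron_after(boost_x(r), boost_x(q)))
```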

A seemingly difficult question is this: What are the most general
transformations of $\mathbb{M}^n$ that preserve those relations? Here we understand
'transformation' synonymously with 'bijective map', so that each transformation f
has an inverse $f^{-1}$. 'Preserving the relation' is taken to mean that f and $f^{-1}$
preserve the relation. Then the somewhat surprising answer to the question
just posed is that, in three or more spacetime dimensions, there are no other
such transformations besides those already listed:

Theorem 3. Let $\succ$ stand for any of the relations $\geq$, $>$, $\gtrdot$ and let F be a
bijection of $\mathbb{M}^n$ with $n \geq 3$, such that $p \succ q$ implies $F(p) \succ F(q)$ and
$F^{-1}(p) \succ F^{-1}(q)$. Then F is the composition of a Lorentz transformation
in $\mathrm{ILor}^\uparrow$ with a dilation.
Proof. These results were proven by A.D. Alexandrov and independently
by E.C. Zeeman. A good review of Alexandrov's results is [1]; Zeeman's
paper is [19]. The restriction to $n \geq 3$ is indeed necessary, as for n = 2
the following possibility exists: Identify $\mathbb{M}^2$ with $\mathbb{R}^2$ and the bilinear form
$g(z, z) = x^2 - y^2$, where z = (x, y). Set u := x - y and v := x + y and define
$f : \mathbb{R}^2 \to \mathbb{R}^2$ by f(u, v) := (h(u), h(v)), where $h : \mathbb{R} \to \mathbb{R}$ is any smooth
function with h' > 0. This defines an orientation preserving diffeomorphism
of $\mathbb{R}^2$ which transforms the set of lines u = const. and v = const. respectively
into each other. Hence it preserves the families of cones (8a). Since these
transformations need not be affine linear they are not generated by dilations
and Lorentz transformations. □
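The two-dimensional counterexample can be made explicit. The following sketch assumes the particular choice $h(t) = t + t^3$, which is smooth with h' > 0 and, in addition, maps the real line onto itself so that the resulting map is a bijection of the plane; the names `h`, `F`, `g`, and `sub` and the sample points are hypothetical.

```python
from fractions import Fraction as Fr

def h(t):
    """Smooth, strictly increasing bijection of the reals: h'(t) = 1 + 3 t^2 > 0."""
    return t + t ** 3

def F(z):
    """Apply h separately to the null coordinates u = x - y and v = x + y."""
    x, y = z
    U, V = h(x - y), h(x + y)
    return (Fr(U + V, 2), Fr(V - U, 2))

def g(z):
    """Quadratic form g(z, z) = x^2 - y^2 (x is the time coordinate)."""
    return z[0] ** 2 - z[1] ** 2

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1])

p, q, r = (0, 0), (3, 1), (2, 2)

# q - p is timelike and future pointing, and so is F(q) - F(p).
print(g(sub(q, p)) > 0, g(sub(F(q), F(p))) > 0, F(q)[0] > F(p)[0])   # True True True

# r - p is lightlike, and so is F(r) - F(p).
print(g(sub(r, p)) == 0, g(sub(F(r), F(p))) == 0)                    # True True

# F is not affine: the midpoint of p and q is not mapped to the midpoint of the images.
mid = (Fr(3, 2), Fr(1, 2))
mid_of_images = ((F(p)[0] + F(q)[0]) / 2, (F(p)[1] + F(q)[1]) / 2)
print(F(mid) == mid_of_images)                                       # False
```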

These results may appear surprising since without a continuity requirement
one might expect all sorts of wild behaviour to allow for more possibilities.
However, a little closer inspection reveals a fairly obvious reason
for why continuity is implied here. Consider the case in which a transformation
F preserves the families $\{\mathcal{C}^+_q \mid q \in \mathbb{M}^n\}$ and $\{\mathcal{C}^-_q \mid q \in \mathbb{M}^n\}$. The open
diamond-shaped sets (usually just called 'open diamonds'),

$$ U(p, q) := (\mathcal{C}^+_p \cap \mathcal{C}^-_q) \cup (\mathcal{C}^+_q \cap \mathcal{C}^-_p) , \tag{15} $$

are obviously open in the standard topology of $\mathbb{M}^n$ (which is that of $\mathbb{R}^n$).
Note that at least one of the intersections in (15) is always empty.
Conversely, it is also easy to see that each open set of $\mathbb{M}^n$ contains an open
diamond. Hence the topology that is defined by taking the U(p, q) as
sub-base (the basis being given by their finite intersections) is equivalent to the
standard topology of $\mathbb{M}^n$. But, by hypothesis, F and $F^{-1}$ preserve the cones
$\mathcal{C}^\pm_q$ and therefore open sets, so that F must, in fact, be a homeomorphism.

There is no such obvious continuity input if one makes the strictly weaker
requirement that instead of the cones (8) one only preserves the doublecones
(6). Does that allow for more transformations, except for the obvious time
reflection? The answer is again in the negative. The following result was
shown by Alexandrov (see his review [1]) and later, in a different fashion,
by Borchers and Hegerfeldt [8]:

Theorem 4. Let $\sim$ denote any of the relations: $p \sim q$ iff $(p - q)^2 \geq 0$, $p \sim q$
iff $(p - q)^2 > 0$, or $p \sim q$ iff $(p - q)^2 = 0$. Let F be a bijection of $\mathbb{M}^n$ with
$n \geq 3$, such that $p \sim q$ implies $F(p) \sim F(q)$ and $F^{-1}(p) \sim F^{-1}(q)$. Then F is
the composition of a Lorentz transformation in ILor with a dilation.

All this shows that, up to dilations, Lorentz transformations can be
characterised by the causal structure of Minkowski space. Let us focus on
a particular sub-case of Theorem 4, which says that any bijection F of $\mathbb{M}^n$
with $n \geq 3$, which satisfies $\|p - q\|_g = 0 \Leftrightarrow \|F(p) - F(q)\|_g = 0$, must be the
composition of a dilation and a transformation in ILor. This is sometimes
referred to as Alexandrov's theorem. It gives a precise answer to the following
physical question: To what extent does the principle of the constancy of a
finite speed of light alone determine the relativity group? The answer is
that it determines it to be a subgroup of the 11-parameter group of Poincaré
transformations and constant rescalings, which is as close to the Poincaré
group as possibly imaginable.

Alexandrov's Theorem is, to my knowledge, the closest analog in
Minkowskian geometry to the famous theorem of Beckman and Quarles [3],
which refers to Euclidean geometry and reads as follows:^8

Theorem 5 (Beckman and Quarles 1953). Let $\mathbb{R}^n$ for $n \geq 2$ be endowed
with the standard Euclidean inner product $\langle \cdot \mid \cdot \rangle$. The associated norm is
given by $\|x\| := \sqrt{\langle x \mid x \rangle}$. Let $\delta$ be any fixed positive real number and
$f : \mathbb{R}^n \to \mathbb{R}^n$ any map such that $\|x - y\| = \delta \Rightarrow \|f(x) - f(y)\| = \delta$; then f is a
Euclidean motion, i.e. $f \in \mathbb{R}^n \rtimes \mathrm{O}(n)$.

Note that there are three obvious points which let the result of Beckman 
and Quarles in Euclidean space appear somewhat stronger than the theorem 
of Alexandrov in Minkowski space: 

1. The conclusion of Theorem 5 holds for any $\delta \in \mathbb{R}_+$, whereas
Alexandrov's theorem singles out lightlike distances.

2. In Theorem 5, n = 2 is not excluded.

3. In Theorem 5, f is not required to be a bijection, so that we did not
assume the existence of an inverse map $f^{-1}$. Correspondingly, there is
no assumption that $f^{-1}$ also preserves the distance $\delta$.

2.5 The impact of the law of inertia 

In this subsection we wish to discuss the extent to which the law of inertia 
already determines the automorphism group of spacetime. 

The law of inertia privileges a subset of paths in spacetime from among
all paths; it defines a so-called path structure [16]. These privileged paths
correspond to the motions of privileged objects called free particles. The
existence of such privileged objects is by no means obvious and must be taken
as a contingent and particularly kind property of nature. It has been known
for long [35][45][31] how to operationally construct timescales and spatial
reference frames relative to which free particles will move uniformly and on
straight lines respectively (all of them!). (A summary of these papers is given
in [25].) These special timescales and spatial reference frames were termed
inertial by Ludwig Lange [35]. Their existence must again be taken as a very
particular and very kind feature of Nature. Note that 'uniform in time' and
'spatially straight' together translate to 'straight in spacetime'. We also
emphasise that 'straightness' of ensembles of paths can be characterised
intrinsically, e.g., by the Desargues property [H]. All this is true if free
particles are given. We do not discuss at this point whether and how one
should characterise them independently (cf. [23]).

^8 In fact, Beckman and Quarles proved the conclusion of Theorem 5 under slightly weaker
hypotheses: They allowed the map f to be 'many-valued', that is, to be a map
$f : \mathbb{R}^n \to \mathcal{S}^n$, where $\mathcal{S}^n$ is the set of non-empty subsets of $\mathbb{R}^n$, such that
$\|x - y\| = \delta \Rightarrow \|x' - y'\| = \delta$ for any $x' \in f(x)$ and any $y' \in f(y)$. However, given the
statement of Theorem 5, it is immediate that such 'many-valued maps' must necessarily
be single-valued. To see this, assume that $x_* \in \mathbb{R}^n$ has the two image points $y_1, y_2$
and define $h_i : \mathbb{R}^n \to \mathbb{R}^n$ for i = 1, 2 such that $h_1(x) = h_2(x) \in f(x)$ for all
$x \neq x_*$ and $h_i(x_*) = y_i$. Then, according to Theorem 5, $h_i$ must both be Euclidean
motions. Since they are continuous and coincide for all $x \neq x_*$, they must also
coincide at $x_*$.

The spacetime structure so defined is usually referred to as projective. 
It is not quite that of an affine space, since the latter provides in addition
each straight line with a distinguished two-parameter family of parametrisa- 
tions, corresponding to a notion of uniformity with which the line is traced 
through. Such a privileged parametrisation of spacetime paths is not pro- 
vided by the law of inertia, which only provides privileged parametrisations 
of spatial paths, which we already took into account in the projective struc- 
ture of spacetime. Instead, an affine structure of spacetime may once more 
be motivated by another contingent property of Nature, shown by the ex- 
istence of elementary clocks (atomic frequencies) which do define the same 
uniformity structure on inertial world lines — all of them! Once more this is 
a highly non-trivial and very kind feature of Nature. In this way we would 
indeed arrive at the statement that spacetime is an affine space. However, 
as we shall discuss in this subsection, the affine group already emerges as 
automorphism group of inertial structures without the introduction of ele- 
mentary clocks. 

First we recall the main theorem of affine geometry. For that we make 
the following 

Definition 3. Three points in an affine space are called collinear iff they 
are contained in a single line. A map between affine spaces is called a 
collineation iff it maps each triple of collinear points to collinear points. 

Note that in this definition no other condition is required of the map, 
like, e.g., injectivity. The main theorem now reads as follows: 

Theorem 6. A bijective collineation of a real affine space of dimension
$n \geq 2$ is necessarily an affine map.

A proof may be found in [6]. That the theorem is non-trivial can, e.g., be 
seen from the fact that it is not true for complex affine spaces. The crucial 
property of the real number field is that it does not allow for non-trivial
automorphisms (as a field).

A particular consequence of Theorem 6 is that bijective collineations are
necessarily continuous (in the natural topology of affine space). This is
of interest for the applications we have in mind for the following reason:
Consider the set P of all lines in some affine space S. P has a natural topology
induced from S. Theorem 6 now implies that bijective collineations of S act
as homeomorphisms of P. Consider an open subset $O \subset P$ and the subset of
all collineations that fix O (as set, not necessarily its points). Then these
collineations also fix the boundary $\partial O$ of O in P. For example, if O is
the set of all timelike lines in Minkowski space, i.e., with a slope less than
some chosen value relative to some fixed direction, then it follows that the
bijective collineations which together with their inverses map timelike lines
to timelike lines also map the lightcone to the lightcone. It immediately
follows that each of them must be the composition of a Poincaré transformation
and a constant dilation. Note that this argument also works in two spacetime
dimensions, where the Alexandrov-Zeeman result does not hold.

The application we have in mind is to inertial motions, which are given
by lines in affine space. In that respect Theorem 6 is not quite appropriate.
Its hypotheses are weaker than needed, insofar as it would suffice to require
straight lines to be mapped to straight lines. But, more importantly, the
hypotheses are also stronger than what seems physically justifiable, insofar
as not every line is realisable by an inertial motion. In particular, one would
like to know whether Theorem 6 can still be derived by restricting to slow
collineations, which one may define by the property that the corresponding
lines should have a slope less than some non-zero angle (in whatever measure,
as long as the set of slow lines is open in the set of all lines) from a
given (time-)direction. This is indeed the case, as one may show by going
through the proof of Theorem 6. Slightly easier to prove is the following:

Theorem 7. Let F be a bijection of real n-dimensional affine space that
maps slow lines to slow lines, then F is an affine map.

A proof may be found in [26]. If 'slowness' is defined via the lightcone 
of a Minkowski metric g, one immediately obtains the result that the affine 
maps must be composed from Poincare transformations and dilations. The 
reason is 

Lemma 8. Let V be a finite dimensional real vector space of dimension
$n \geq 2$ and g be a non-degenerate symmetric bilinear form on V of signature
$(1, n-1)$. Let h be any other symmetric bilinear form on V. The 'light
cones' for both forms are defined by $\mathcal{L}_g := \{v \in V \mid g(v,v) = 0\}$ and
$\mathcal{L}_h := \{v \in V \mid h(v,v) = 0\}$. Suppose $\mathcal{L}_g \subset \mathcal{L}_h$, then $h = \alpha g$ for some $\alpha \in \mathbb{R}$.

Proof. Let $\{e_0, e_1, \dots, e_{n-1}\}$ be a basis of V such that
$g_{ab} := g(e_a, e_b) = \mathrm{diag}(1, -1, \dots, -1)$. Then $(e_0 \pm e_a) \in \mathcal{L}_g$ for
$1 \leq a \leq n-1$ implies (we write $h_{ab} := h(e_a, e_b)$): $h_{0a} = 0$ and
$h_{00} + h_{aa} = 0$. Further, $(\sqrt{2}\,e_0 + e_a + e_b) \in \mathcal{L}_g$ for $1 \leq a < b \leq n-1$
then implies $h_{ab} = 0$ for $a \neq b$. Hence $h = \alpha g$ with $\alpha = h_{00}$. □

This can be applied as follows: If $F : S \to S$ is affine and maps lightlike
lines to lightlike lines, then the associated linear map $f : V \to V$ maps
lightlike vectors to lightlike vectors. Hence $h(v,v) := g(f(v),f(v))$ vanishes
if $g(v,v)$ vanishes and therefore $h = \alpha g$ by Lemma 8. Since $f(v)$ is timelike
if v is timelike, $\alpha$ is positive. Hence we may define $f' := f/\sqrt{\alpha}$ and have
$g(f'(v), f'(v)) = g(v,v)$ for all $v \in V$, saying that $f'$ is a Lorentz
transformation. f is the composition of a Lorentz transformation and a
dilation by $\sqrt{\alpha}$.
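
The factorisation just described can be illustrated numerically. The following is only a minimal sketch in 1+1 dimensions, under the assumption that the linear map is a boost followed by a dilation by 3; the helper names (`g`, `apply`, `boost`, `conformal_factor`) are hypothetical and chosen for this illustration. It reads off the factor of Lemma 8 from the image of $e_0$ and rescales the map to obtain a Lorentz transformation.

```python
import math

def g(u, w):
    """Minkowski form of signature (1, 1): g(u, w) = u0*w0 - u1*w1."""
    return u[0] * w[0] - u[1] * w[1]

def apply(m, u):
    """Apply a 2x2 matrix (given as a tuple of rows) to a vector."""
    return (m[0][0] * u[0] + m[0][1] * u[1],
            m[1][0] * u[0] + m[1][1] * u[1])

def boost(rho):
    """Lorentz boost with rapidity rho in the (tau, x) plane."""
    c, s = math.cosh(rho), math.sinh(rho)
    return ((c, -s), (-s, c))

def conformal_factor(f):
    """The factor alpha with g(f(v), f(v)) = alpha * g(v, v), read off at v = e_0."""
    e0 = (1.0, 0.0)
    return g(apply(f, e0), apply(f, e0)) / g(e0, e0)

# Assumed example map: a boost followed by a dilation by 3.
scale = 3.0
f = tuple(tuple(scale * x for x in row) for row in boost(0.7))

alpha = conformal_factor(f)  # equals scale**2 up to rounding
# Rescaled map f' = f / sqrt(alpha), which preserves g.
f_prime = tuple(tuple(x / math.sqrt(alpha) for x in row) for row in f)
```

Here `f_prime` preserves `g` on arbitrary vectors, while `f` itself only preserves the light cone.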

2.6 The impact of relativity 

As is well known, the two main ingredients in Special Relativity are the
Principle of Relativity (henceforth abbreviated by RP) and the principle
of the constancy of the velocity of light. We have seen above that, due to
Alexandrov's Theorem, the latter almost suffices to arrive at the Poincaré
group. In this section we wish to address the complementary question: Under
what conditions and to what extent can the RP alone justify the Poincaré group?

This question was first addressed by Ignatowsky [30], who showed that 
under a certain set of technical assumptions (not consistently spelled out by 
him) the RP alone suffices to arrive at a spacetime symmetry group which is 
either the inhomogeneous Galilei or the inhomogeneous Lorentz group, the 
latter for some yet undetermined limiting velocity c. 

More precisely, what is actually shown in this fashion is, as we will 
see, that the relativity group must contain either the proper orthochronous 
Galilei or Lorentz group, if the group is required to comprise at least space- 
time translations, spatial rotations, and boosts (velocity transformations). 
What we hence gain is the group-theoretic insight of how these transforma- 
tions must combine into a common group, given that they form a group at 
all. We do not learn anything about other transformations, like spacetime 
reflections or dilations, whose existence we neither required nor ruled out at 
this level. 

The work of Ignatowsky was put into a logically more coherent form by 
Frank & Rothe [21] [22], who showed that some of the technical assumptions
could be dropped. Further formal simplifications were achieved by Berzi &
Gorini [7]. Below we shall basically follow their line of reasoning, except that
we do not impose the continuity of the transformations as a requirement, but 
conclude it from their preservation of the inertial structure plus bijectivity. 
See also [2] for an alternative discussion on the level of Lie algebras. 

For further determination of the automorphism group of spacetime we 
invoke the following principles: 

ST1: Homogeneity of spacetime.

ST2: Isotropy of space.

ST3: Galilean principle of relativity.



We take ST1 to mean that the sought-for group should include all translations
and hence be a subgroup of the general affine group. With respect to
some chosen basis, it must be of the form $\mathbb{R}^4 \rtimes G$, where G is a subgroup
of $GL(4,\mathbb{R})$. ST2 is interpreted as saying that G should include the set of
all spatial rotations. If, with respect to some frame, we write the general
element $A \in GL(4,\mathbb{R})$ in a 1 + 3 split form (thinking of the first coordinate
as time, the other three as space), we want G to include all

$$ R(D) = \begin{pmatrix} 1 & 0^\top \\ 0 & D \end{pmatrix}, \quad \text{where } D \in SO(3). \tag{16} $$

Finally, ST3 says that velocity transformations, henceforth called 'boosts',
are also contained in G. However, at this stage we do not know how boosts
are to be represented mathematically. Let us make the following assumptions:

B1: Boosts $B(\vec{v})$ are labelled by a vector $\vec{v} \in B_c(\mathbb{R}^3)$, where $B_c(\mathbb{R}^3)$ is
the open ball in $\mathbb{R}^3$ of radius c. The physical interpretation of $\vec{v}$ shall
be that of the boost velocity, as measured in the system from which
the transformation is carried out. We allow c to be finite or infinite
($B_\infty(\mathbb{R}^3) = \mathbb{R}^3$). $\vec{v} = 0$ corresponds to the identity transformation, i.e.
$B(0) = \mathrm{id}_{\mathbb{R}^4}$. We also assume that $\vec{v}$, considered as coordinate function
on the group, is continuous.

B2: As part of ST2 we require equivariance of boosts under rotations:

$$ R(D) \cdot B(\vec{v}) \cdot R(D^{-1}) = B(D \cdot \vec{v}). \tag{17} $$

The latter assumption allows us to restrict attention to boosts in a fixed
direction, say that of the positive x-axis. Once their analytical form is
determined as function of v, where $\vec{v} = v e_x$, we deduce the general expression
for boosts using (17) and (16). We make no assumptions involving space
reflections.

Footnote on space reflections: Some derivations in the literature of the
Lorentz group do not state the equivariance property (17) explicitly, though
they all use it (implicitly), usually in statements to the effect that it is
sufficient to consider boosts in one fixed direction. Once this restriction
is effected, a one-dimensional spatial reflection transformation is considered
to relate a boost transformation to that with opposite velocity. This then
gives the impression that reflection equivariance is also invoked, though this
is not necessary in spacetime dimensions greater than two, for (17) allows one
to invert one axis through a 180-degree rotation about a perpendicular one.

We now restrict attention to $\vec{v} = v e_x$. We wish to determine
the most general form of $B(\vec{v})$ compatible with all requirements put so far.
We proceed in several steps:

1. Using an arbitrary rotation D around the x-axis, so that $D \cdot \vec{v} = \vec{v}$,
equation (17) allows one to prove that

$$ B(v e_x) = \begin{pmatrix} A(v) & 0 \\ 0 & \alpha(v)\,\mathbf{1}_2 \end{pmatrix}, \tag{18} $$

where here we wrote the 4x4 matrix in a 2 + 2 decomposed form
(i.e. A(v) is a 2x2 matrix and $\mathbf{1}_2$ is the 2x2 unit-matrix). Applying
(17) once more, this time using a $\pi$-rotation about the y-axis, we learn
that $\alpha$ is an even function, i.e.

$$ \alpha(v) = \alpha(-v). \tag{19} $$

Below we will see that $\alpha(v) \equiv 1$.

2. Let us now focus on A(v), which defines the action of the boost in the
t-x plane. We write

$$ \begin{pmatrix} t \\ x \end{pmatrix} \mapsto \begin{pmatrix} t' \\ x' \end{pmatrix} = A(v) \begin{pmatrix} t \\ x \end{pmatrix} = \begin{pmatrix} a(v) & b(v) \\ c(v) & d(v) \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix}. \tag{20} $$

We refer to the system with coordinates (t,x) as K and that with coordinates
(t',x') as K'. From (20) and the inverse (which is elementary
to compute) one infers that the velocity v of K' with respect to K and
the velocity v' of K with respect to K' are given by

$$ v = -c(v)/d(v), \tag{21a} $$
$$ v' = -v\,d(v)/a(v) =: \varphi(v). \tag{21b} $$

Since the transformation K' → K is the inverse of K → K', the function
$\varphi : (-c,c) \to (-c,c)$ obeys

$$ A(\varphi(v)) = (A(v))^{-1}. \tag{22} $$

Hence $\varphi$ is a bijection of the open interval $(-c, c)$ onto itself and obeys

$$ \varphi \circ \varphi = \mathrm{id}_{(-c,c)}. \tag{23} $$

3. Next we determine $\varphi$. Once more using (17), where D is a $\pi$-rotation
about the y-axis, shows that the functions a and d in (18) are even and
the functions b and c are odd. The definition (21b) of $\varphi$ then implies
that $\varphi$ is odd. Since we assumed v to be a continuous coordinatisation
of a topological group, the map $\varphi$ must also be continuous (since the
inversion map, $g \mapsto g^{-1}$, is continuous in a topological group). A
standard theorem now states that a continuous bijection of an interval
of $\mathbb{R}$ onto itself must be strictly monotonic. Together with (23) this
implies that $\varphi$ is either the identity or minus the identity map. If it is
the identity map, evaluation of (22) shows that either the determinant
of A(v) must equal -1, or that A(v) is the identity for all v. We
exclude the second possibility straightaway and the first one on the
grounds that we required A(v) be the identity for v = 0. Also, in that
case, (22) implies $A^2(v) = \mathrm{id}$ for all $v \in (-c,c)$. We conclude that
$\varphi = -\mathrm{id}$, which implies that the relative velocity of K with respect
to K' is minus the relative velocity of K' with respect to K. Plausible
as it might seem, there is no a priori reason why this should be so.
On the face of it, the RP only implies (23), not the stronger relation
$\varphi(v) = -v$. This was first pointed out in [7].

4. We briefly revisit (19). Since we have seen that $B(-v e_x)$ is the inverse
of $B(v e_x)$, we must have $\alpha(-v) = 1/\alpha(v)$, so that (19) implies
$\alpha(v) \equiv \pm 1$. But only $\alpha(v) \equiv +1$ is compatible with our requirement
that B(0) be the identity.

5. Now we return to the determination of A(v). Using (21) and $\varphi = -\mathrm{id}$,
we write

$$ A(v) = \begin{pmatrix} a(v) & b(v) \\ -v\,a(v) & a(v) \end{pmatrix} \tag{24} $$

and

$$ \Delta(v) := \det(A(v)) = a(v)\,[a(v) + v\,b(v)]. \tag{25} $$

Equation $A(-v) = (A(v))^{-1}$ is now equivalent to

$$ a(-v) = a(v)/\Delta(v), \tag{26a} $$
$$ b(-v) = -b(v)/\Delta(v). \tag{26b} $$

Since, as already seen, a is an even and b is an odd function, (26) is
equivalent to $\Delta(v) \equiv 1$, i.e. the unimodularity of B(v). Equation (25)
then allows one to express b in terms of a:

$$ b(v) = \frac{a(v)}{v}\left[\frac{1}{a^2(v)} - 1\right]. \tag{27} $$



Footnote to step 3 (on $\varphi = \pm\mathrm{id}$): The simple proof is as follows, where
we write $v' := \varphi(v)$ to save notation, so that (23) now reads $v'' = v$. First
assume that $\varphi$ is strictly monotonically increasing, then $v' > v$ implies
$v = v'' > v'$, a contradiction, and $v' < v$ implies $v = v'' < v'$, likewise
a contradiction. Hence $\varphi = \mathrm{id}$ in this case. Next assume $\varphi$ is strictly
monotonically decreasing. Then $\tilde\varphi := -\varphi$ is a strictly monotonically
increasing map of the interval $(-c,c)$ to itself that obeys (23). Hence, as
just seen, $\tilde\varphi = \mathrm{id}$, i.e. $\varphi = -\mathrm{id}$.

Footnote to step 3 (on the relation between the two relative velocities):
Note that v and v' are measured with different sets of rods and clocks.

6. Our problem is now reduced to the determination of the single function
a. This we achieve by employing the requirement that the composition
of two boosts in the same direction results again in a boost in that
direction, i.e.

$$ A(v) \cdot A(v') = A(v''). \tag{28} $$

According to (24) each matrix A(v) has equal diagonal entries. Applied
to the product matrix on the left hand side of (28) this implies
that $v^{-2}(a^{-2}(v) - 1)$ is independent of v, i.e. equal to some constant k
whose physical dimension is that of an inverse velocity squared. Hence
we have

$$ a(v) = \frac{1}{\sqrt{1 + k v^2}}, \tag{29} $$

where we have chosen the positive square root since we require a(0) =
1. The other implications of (28) are

$$ a(v)\,a(v')\,(1 - k v v') = a(v''), \tag{30a} $$
$$ a(v)\,a(v')\,(v + v') = v''\,a(v''), \tag{30b} $$

from which we deduce

$$ v'' = \frac{v + v'}{1 - k v v'}. \tag{31} $$

Conversely, (29) and (31) imply (30). We conclude that (28) is equivalent
to (29) and (31).

7. So far a boost in x direction has been shown to act non-trivially only
in the t-x plane, where its action is given by the matrix that results
from inserting (27) and (29) into (24):

$$ A(v) = \begin{pmatrix} a(v) & k v\,a(v) \\ -v\,a(v) & a(v) \end{pmatrix}, \quad \text{where } a(v) = 1/\sqrt{1 + k v^2}. \tag{32} $$

• If k > 0 we rescale $t \mapsto \tau := t/\sqrt{k}$ and set $\sqrt{k}\,v := \tan\alpha$. Then
(32) is seen to be a Euclidean rotation with angle $\alpha$ in the $\tau$-x
plane. The velocity spectrum is the whole real line plus infinity,
i.e. a circle, corresponding to $\alpha \in [0, 2\pi]$, where 0 and $2\pi$
are identified. Accordingly, the composition law (31) is just ordinary
addition for the angle $\alpha$. This causes several paradoxa
when v is interpreted as velocity. For example, composing two
finite velocities v, v' which satisfy $v v' = 1/k$ results in $v'' = \infty$,
and composing two finite and positive velocities, each of which is
greater than $1/\sqrt{k}$, results in a finite but negative velocity. In this
way the successive composition of finite positive velocities could
also result in zero velocity. The group $G \subset GL(4, \mathbb{R})$ obtained
in this fashion is, in fact, SO(4). This group may be uniquely
characterised as the largest connected group of bijections of $\mathbb{R}^4$
that preserves the Euclidean distance measure. In particular, it
treats time symmetrically with all space directions, so that no
invariant notion of time-orientability can be given in this case.

• For k = 0 the transformations are just the ordinary boosts of the
Galilei group. The velocity spectrum is the whole real line (i.e. v
is unbounded but finite) and G is the Galilei group. The law for
composing velocities is just ordinary vector addition.

• Finally, for k < 0, one infers from (31) that $c := 1/\sqrt{-k}$ is an
upper bound for all velocities, in the sense that composing two
velocities taken from the interval $(-c, c)$ always results in a velocity
from within that interval. Writing $\tau := ct$, $v/c =: \beta =: \tanh\rho$,
and $\gamma = 1/\sqrt{1 - \beta^2}$, the matrix (32) is seen to be a Lorentz boost
or hyperbolic motion in the $\tau$-x plane:

$$ \begin{pmatrix} \tau \\ x \end{pmatrix} \mapsto \begin{pmatrix} \gamma & -\beta\gamma \\ -\beta\gamma & \gamma \end{pmatrix} \begin{pmatrix} \tau \\ x \end{pmatrix} = \begin{pmatrix} \cosh\rho & -\sinh\rho \\ -\sinh\rho & \cosh\rho \end{pmatrix} \begin{pmatrix} \tau \\ x \end{pmatrix}. \tag{33} $$

The quantity

$$ \rho := \tanh^{-1}(v/c) = \tanh^{-1}(\beta) \tag{34} $$

is called rapidity. If rewritten in terms of the corresponding
rapidities the composition law (31) reduces to ordinary addition:
$\rho'' = \rho + \rho'$.

This shows that only the Galilei and the Lorentz group survive as candidates
for any symmetry group implementing the RP. Once the Lorentz
group for velocity parameter c is chosen, one may fully characterise it by its
property to leave a certain symmetric bilinear form invariant. In this sense
the geometric structure of Minkowski space can be deduced. This closes the
circle to where we started from in Section 2.3.
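
The group law derived in steps 6 and 7 can be checked numerically. The sketch below is only an illustration, assuming the boost matrix $A(v) = a(v)\begin{pmatrix}1 & kv\\ -v & 1\end{pmatrix}$ with $a(v) = 1/\sqrt{1+kv^2}$ and the composition law $v'' = (v+v')/(1-kvv')$; the function names are hypothetical. For $k > 0$ the sample velocities are chosen with $kvv' < 1$, so that the positive square root for $a$ stays applicable.

```python
import math

def a(v, k):
    """Diagonal entry a(v) = 1/sqrt(1 + k v^2)."""
    return 1.0 / math.sqrt(1.0 + k * v * v)

def A(v, k):
    """Boost matrix in the t-x plane, as a tuple of rows."""
    av = a(v, k)
    return ((av, k * v * av), (-v * av, av))

def matmul(m, n):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(tuple(sum(m[i][l] * n[l][j] for l in range(2))
                       for j in range(2)) for i in range(2))

def compose(v, w, k):
    """Velocity composition law v'' = (v + w) / (1 - k v w)."""
    return (v + w) / (1.0 - k * v * w)

def max_defect(v, w, k):
    """Largest entry of A(v) A(w) - A(v''); zero if the group law holds."""
    lhs, rhs = matmul(A(v, k), A(w, k)), A(compose(v, w, k), k)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

# One sample per regime: k > 0 (rotation), k = 0 (Galilei), k < 0 (Lorentz, c = 1).
samples = [(1.0, 0.3, 0.5), (0.0, 2.0, 5.0), (-1.0, 0.6, 0.7)]
defects = [max_defect(v, w, k) for k, v, w in samples]

# In the Lorentz case rapidities add.
rho_sum = math.atanh(0.6) + math.atanh(0.7)
rho_composed = math.atanh(compose(0.6, 0.7, -1.0))
```

In the Lorentz case the composed velocity stays inside $(-c, c)$, whereas for $k = 0$ the law reduces to plain addition.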



2.7 Local versions 

In the previous sections we always understood an automorphism of a structured
set (spacetime) as a bijection. Mathematically this seems an obvious
requirement, but from a physical point of view this is less clear. The phys- 
ical law of inertia provides us with distinguished motions locally in space 
and time. Hence one may attempt to relax the condition for structure pre- 
serving maps, so as to only preserve inertial motions locally. Hence we ask 
the following question: What are the most general maps that locally map 
segments of straight lines to segments of straight lines? This local approach 
has been pursued by [20]. 

Footnote on the term 'rapidity': This term was coined by Robb, but the
quantity was used before by others; compare

To answer this question completely, let us (locally) identify spacetime
with $\mathbb{R}^n$ where $n \geq 2$ and assume the map to be $C^3$, that is, three times
continuously differentiable. So let $U \subset \mathbb{R}^n$ be an open subset and
determine all maps $f : U \to \mathbb{R}^n$ that map straight segments in U into
straight segments in $\mathbb{R}^n$. In coordinates we write $x = (x^1, \dots, x^n) \in U$ and
$y = (y^1, \dots, y^n) \in f(U) \subset \mathbb{R}^n$, so that $y^\mu := f^\mu(x)$. A straight segment
in U is a curve $\gamma : I \to U$ (the open interval $I \subset \mathbb{R}$ is usually taken to
contain zero) whose acceleration is pointwise proportional to its velocity.
This is equivalent to saying that it can be parametrised so as to have zero
acceleration, i.e., $\gamma(s) = as + b$ for some $a, b \in \mathbb{R}^n$.

For the image path $f \circ \gamma$ to be again straight its acceleration,
$(f'' \circ \gamma)(a, a)$, must be proportional to its velocity, $(f' \circ \gamma)(a)$, where the
factor of proportionality, C, depends on the point of the path and separately on a.
Hence, in coordinates, we have

$$ f^\mu_{,\alpha\beta}(as + b)\,a^\alpha a^\beta = f^\mu_{,\alpha}(as + b)\,a^\alpha\, C(as + b, a). \tag{35} $$

For each b this must be valid for all (a, s) in a neighbourhood of zero in
$\mathbb{R}^n \times \mathbb{R}$. Taking the second derivatives with respect to a, evaluation at a = 0,
s = 0 leads to

$$ f^\mu_{,\alpha\beta} = \Gamma^\nu_{\alpha\beta}\, f^\mu_{,\nu}, \tag{36a} $$

where

$$ \Gamma^\nu_{\alpha\beta} := \delta^\nu_\alpha \psi_\beta + \delta^\nu_\beta \psi_\alpha, \tag{36b} $$
$$ \psi_\alpha := \frac{1}{2}\,\frac{\partial C(\cdot, a)}{\partial a^\alpha}\bigg|_{a=0}. \tag{36c} $$

Here we suppressed the remaining argument b. Equation (36) is valid at each
point in U. Integrability of (36a) requires that its further differentiation is
totally symmetric with respect to all lower indices (here we use that the map
f is $C^3$). This leads to

$$ R^\mu{}_{\alpha\beta\gamma} := \partial_\beta \Gamma^\mu_{\alpha\gamma} + \Gamma^\mu_{\sigma\beta}\Gamma^\sigma_{\alpha\gamma} - (\beta \leftrightarrow \gamma) = 0. \tag{37} $$

Inserting (36b) one can show (upon taking traces over $\mu\alpha$ and $\mu\gamma$) that the
resulting equation is equivalent to

$$ \psi_{\alpha,\beta} = \psi_\alpha \psi_\beta. \tag{38} $$

In particular $\psi_{\alpha,\beta} = \psi_{\beta,\alpha}$ so that there is a local function $\psi : U \to \mathbb{R}$ (if
U is simply connected, as we shall assume) for which $\psi_\alpha = \psi_{,\alpha}$. Equation
(38) is then equivalent to $\partial_\alpha \partial_\beta \exp(-\psi) = 0$ so that $\psi(x) = -\ln(p \cdot x + q)$
for some $p \in \mathbb{R}^n$ and $q \in \mathbb{R}$. Using $\psi_\sigma = \psi_{,\sigma}$ and (38), equation (36a) is
equivalent to $\partial_\lambda \partial_\sigma [f^\mu \exp(-\psi)] = 0$, which finally leads to the result that
the most general solution for f is given by

$$ f(x) = \frac{A \cdot x + a}{p \cdot x + q}. \tag{39} $$

Here A is an n x n matrix, a and p vectors in $\mathbb{R}^n$, and $q \in \mathbb{R}$. p and q
must be such that U does not intersect the hyperplane
$H(p, q) := \{x \in \mathbb{R}^n \mid p \cdot x + q = 0\}$ where f becomes singular, but otherwise
they are arbitrary. Iff $H(p, q) \neq \emptyset$, i.e. iff $p \neq 0$, the transformations (39)
are not affine. In this case they are called proper projective.

Footnote on the $C^3$ assumption: This requirement distinguishes the present
(local) from the previous (global) approaches, in which not even continuity
needed to be assumed.

Are there physical reasons to rule out such proper projective transformations?
A structural argument is that they do not leave any subset of $\mathbb{R}^n$
invariant and that they hence cannot be considered as automorphism group
of any subdomain. A physical argument is that two separate points that
move with the same velocity cease to do so if their worldlines are transformed
by a proper projective transformation. In particular, a rigid
motion of an extended body (undergoing inertial motion) ceases to be rigid
if so transformed (cf. [17], p. 16). An illustrative example is the following:
Consider the one-parameter ($\sigma$) family of parallel lines $x(s, \sigma) = s e_0 + \sigma e_1$
(where s is the parameter along each line), and the proper projective map
$f(x) = x/(-e_0 \cdot x + 1)$ which becomes singular on the hyperplane $x^0 = 1$.
The one-parameter family of image lines

$$ y(s, \sigma) := f(x(s, \sigma)) = \frac{s e_0 + \sigma e_1}{1 - s} \tag{40} $$

have velocities

$$ \partial_s y(s, \sigma) = \frac{e_0 + \sigma e_1}{(1 - s)^2} \tag{41} $$

whose directions are independent of s, showing that they are indeed straight.
However, the velocity directions now depend on $\sigma$, showing that they are
not parallel anymore.
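
The example can be replayed with exact rational arithmetic. The following sketch is only an illustration in the $(x^0, x^1)$ plane, using the map $f(x) = x/(1 - x^0)$ and the family $x(s,\sigma) = s e_0 + \sigma e_1$ from the text; the helper names are hypothetical. It checks that images of points on one line remain collinear, while the image directions of different lines are no longer parallel.

```python
from fractions import Fraction as F

def f(x):
    """Proper projective map f(x) = x / (1 - x^0) on points of the plane."""
    d = 1 - x[0]
    return (x[0] / d, x[1] / d)

def line_point(s, sigma):
    """Point x(s, sigma) = s e_0 + sigma e_1 of the parallel family."""
    return (F(s), F(sigma))

def sub(u, w):
    return (u[0] - w[0], u[1] - w[1])

def cross(u, w):
    """Vanishes exactly when u and w are linearly dependent."""
    return u[0] * w[1] - u[1] * w[0]

def image_direction(sigma, s0=F(0), s1=F(1, 2)):
    """Chord of the image line between the parameter values s0 and s1."""
    return sub(f(line_point(s1, sigma)), f(line_point(s0, sigma)))

# Three image points of the line sigma = 2 are collinear.
pts = [f(line_point(s, 2)) for s in (F(0), F(1, 4), F(1, 2))]
collinear = cross(sub(pts[1], pts[0]), sub(pts[2], pts[0])) == 0

# The image directions for sigma = 0 and sigma = 1 are not parallel.
parallel = cross(image_direction(0), image_direction(1)) == 0
```

With these sample values `collinear` is true and `parallel` is false, in line with the argument above: straightness survives, parallelism does not.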

Let us, regardless of this, for the moment take seriously the transformations
(39). One may reduce them to the following form of generalised boosts,
discarding translations and rotations and using equivariance with respect to
the latter (we restrict to four spacetime dimensions from now on):

$$ t' = \frac{a(v)\,t + b(v)\,(\vec{v} \cdot \vec{x})}{A(v) + B(v)\,t + D(v)\,(\vec{v} \cdot \vec{x})}, \tag{42a} $$
$$ \vec{x}'_\parallel = \frac{d(v)\,\vec{v}\,t + e(v)\,\vec{x}_\parallel}{A(v) + B(v)\,t + D(v)\,(\vec{v} \cdot \vec{x})}, \tag{42b} $$
$$ \vec{x}'_\perp = \frac{f(v)\,\vec{x}_\perp}{A(v) + B(v)\,t + D(v)\,(\vec{v} \cdot \vec{x})}, \tag{42c} $$

where $\vec{v} \in \mathbb{R}^3$ represents the boost velocity, $v := \|\vec{v}\|$ its modulus, and all
functions of v are even. The subscripts $\parallel$ and $\perp$ refer to the components
parallel and perpendicular to $\vec{v}$. Now one imposes the following conditions
which allow one to determine the eight functions a, b, d, e, f, A, B, D, of which
only seven are considered independent since common factors of the numerator
and denominator cancel (we essentially follow [38]):

1. The origin $\vec{x}' = 0$ has velocity $\vec{v}$ in the unprimed coordinates, leading
to e(v) = -d(v) and thereby eliminating e as independent function.

2. The origin $\vec{x} = 0$ has velocity $-\vec{v}$ in the primed coordinates, leading
to d(v) = -a(v) and thereby eliminating d as independent function.

3. Reciprocity: The transformation parametrised by — v is the inverse 
of that parametrised by v, leading to relations A = A(a,b,v), B = 
B(D,a,b,v), and f = A, thereby eliminating A,B,f as independent 
functions. Of the remaining three functions a, b,D an overall factor 
in the numerator and denominator can be split off so that two free 
functions remain. 



4. Transitivity: The composition of two transformations of the type (42)
with parameters $\vec{v}$ and $\vec{v}'$ must be again of this form with some parameter
$\vec{v}''(\vec{v}, \vec{v}')$, which turns out to be the same function of the velocities
$\vec{v}$ and $\vec{v}'$ as in Special Relativity (Einstein's addition law), for reasons
to become clear soon. This allows one to determine the last two functions
in terms of two constants c and R whose physical dimensions
are that of a velocity and of a length respectively. Writing, as usual,
$\gamma(v) := 1/\sqrt{1 - v^2/c^2}$ the final form is given by

$$ t' = \frac{\gamma(v)\,(t - \vec{v} \cdot \vec{x}/c^2)}{1 - (\gamma(v) - 1)\,ct/R + \gamma(v)\,\vec{v} \cdot \vec{x}/Rc}, \tag{43a} $$
$$ \vec{x}'_\parallel = \frac{\gamma(v)\,(\vec{x}_\parallel - \vec{v}\,t)}{1 - (\gamma(v) - 1)\,ct/R + \gamma(v)\,\vec{v} \cdot \vec{x}/Rc}, \tag{43b} $$
$$ \vec{x}'_\perp = \frac{\vec{x}_\perp}{1 - (\gamma(v) - 1)\,ct/R + \gamma(v)\,\vec{v} \cdot \vec{x}/Rc}. \tag{43c} $$

In the limit as $R \to \infty$ this approaches an ordinary Lorentz boost:

$$ L(\vec{v}) : (t, \vec{x}_\parallel, \vec{x}_\perp) \mapsto \big(\gamma(v)(t - \vec{v} \cdot \vec{x}/c^2),\ \gamma(v)(\vec{x}_\parallel - \vec{v}\,t),\ \vec{x}_\perp\big). \tag{44} $$

Moreover, for finite R the map (43) is conjugate to (44) with respect to a time
dependent deformation. To see this, observe that the common denominator
in (43) is just $(R + ct)/(R + ct')$, whereas the numerators correspond to (44).
Hence, introducing the deformation map

$$ \phi : (t, \vec{x}) \mapsto \left(\frac{t}{1 - ct/R},\ \frac{\vec{x}}{1 - ct/R}\right) \tag{45} $$

and denoting the map $(t, \vec{x}) \mapsto (t', \vec{x}')$ in (43) by f, we have

$$ f = \phi \circ L(\vec{v}) \circ \phi^{-1}. \tag{46} $$

Note that $\phi$ is singular at the hyperplane t = R/c and has no point of the
hyperplane t = -R/c in its image. The latter hyperplane is the singularity
set of $\phi^{-1}$. Outside the hyperplanes $t = \pm R/c$ the map $\phi$ relates the
following time slabs in a diffeomorphic fashion:

$$ 0 \leq t < R/c \ \mapsto\ 0 \leq t < \infty, \tag{47a} $$
$$ R/c < t < \infty \ \mapsto\ -\infty < t < -R/c, \tag{47b} $$
$$ -\infty < t \leq 0 \ \mapsto\ -R/c < t \leq 0. \tag{47c} $$

Since boosts leave the upper-half spacetime, t > 0, invariant (as set), (47a)
shows that f just squashes the linear action of boosts in $0 < t < \infty$ into
a non-linear action within $0 < t < R/c$, where R now corresponds to an
invariant scale. Interestingly, this is the same deformation of boosts that
has been recently considered in what is sometimes called Doubly Special
Relativity (because there are now two, rather than just one, invariant scales,
R and c), albeit there the deformation of boosts takes place in momentum
space where R then corresponds to an invariant energy scale; see [37] and
also [32].
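
The conjugation relation between the fractional-linear boost and the ordinary Lorentz boost can be tested numerically. The sketch below is only an illustration in 1+1 dimensions, assuming the deformation map $\phi(t,x) = (t,x)/(1 - ct/R)$ with inverse $(t,x)/(1 + ct/R)$; the function names and the sample values of c, R and v are hypothetical. Sample points are kept away from the singular hyperplanes.

```python
import math

def gamma(v, c):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def lorentz(t, x, v, c):
    """Ordinary Lorentz boost in 1+1 dimensions."""
    g = gamma(v, c)
    return g * (t - v * x / c**2), g * (x - v * t)

def phi(t, x, c, R):
    """Assumed deformation map: (t, x) -> (t, x) / (1 - c t / R)."""
    d = 1.0 - c * t / R
    return t / d, x / d

def phi_inv(t, x, c, R):
    """Inverse of phi: (t, x) -> (t, x) / (1 + c t / R)."""
    d = 1.0 + c * t / R
    return t / d, x / d

def deformed_boost(t, x, v, c, R):
    """Fractional-linear boost with the common denominator written out."""
    g = gamma(v, c)
    d = 1.0 - (g - 1.0) * c * t / R + g * v * x / (R * c)
    return g * (t - v * x / c**2) / d, g * (x - v * t) / d

def conjugated_boost(t, x, v, c, R):
    """phi o L(v) o phi^{-1}, evaluated step by step."""
    return phi(*lorentz(*phi_inv(t, x, c, R), v, c), c, R)

c, R, v = 1.0, 10.0, 0.5
points = [(0.3, 0.2), (1.0, -2.0), (2.0, 0.5)]
# Largest coordinate difference between the two ways of computing the map.
deviation = max(
    max(abs(p - q) for p, q in zip(deformed_boost(t, x, v, c, R),
                                   conjugated_boost(t, x, v, c, R)))
    for t, x in points)
```

For very large R both maps approach the ordinary boost, which gives a second quick consistency check.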

3 Selected structures in Minkowski space 

In this section we wish to discuss in more detail some of the non-trivial
structures in Minkowski space. I have chosen them so as to emphasise the difference to
the corresponding structures in Galilean spacetime, and also because they 
do not seem to be much discussed in other standard sources. 

3.1 Simultaneity 

Let us start right away by characterising those vectors for which we have an 
inverted Cauchy-Schwarz inequality: 

Lemma 9. Let V be of dimension n > 2 and $v \in V$ be some non-zero vector.
The strict inverted Cauchy-Schwarz inequality,

$$ v^2 w^2 < (v \cdot w)^2, \tag{48} $$

holds for all $w \in V$ linearly independent of v iff v is timelike.

Proof. Obviously v cannot be spacelike, for then we would violate (48) with
any spacelike w. If v is lightlike then w violates (48) iff it is in the set
$v^\perp - \mathrm{span}\{v\}$, which is non-empty iff n > 2. Hence v cannot be lightlike if
n > 2. If v is timelike we decompose $w = av + w'$ with $w' \in v^\perp$ so that
$w'^2 \leq 0$, with equality iff v and w are linearly dependent. Hence

$$ (v \cdot w)^2 - v^2 w^2 = -v^2 w'^2 \geq 0, \tag{49} $$

with equality iff v and w are linearly dependent. □

The next Lemma deals with the intersection of a causal line with a light
cone, a situation depicted in Fig. 1.

Lemma 10. Let $\mathcal{L}_p$ be the light-doublecone with vertex p and
$\ell := \{r + \lambda v \mid \lambda \in \mathbb{R}\}$ be a non-spacelike line, i.e. $v^2 \geq 0$, through $r \notin \mathcal{L}_p$.
If v is timelike $\ell \cap \mathcal{L}_p$ consists of two points. If v is lightlike this
intersection consists of one point if $p - r \notin v^\perp$ and is empty if $p - r \in v^\perp$.
Note that the latter two statements are independent of the choice of $r \in \ell$
(as they must be), i.e. are invariant under $r \mapsto r' := r + \alpha v$, where $\alpha \in \mathbb{R}$.

Proof. We have $r + \lambda v \in \mathcal{L}_p$ iff

$$ (r + \lambda v - p)^2 = 0 \iff \lambda^2 v^2 + 2\lambda\, v \cdot (r - p) + (r - p)^2 = 0. \tag{50} $$

For v timelike we have $v^2 > 0$ and (50) has two solutions

$$ \lambda_\pm = \frac{1}{v^2}\left\{ -v \cdot (r - p) \pm \sqrt{(v \cdot (r - p))^2 - v^2 (r - p)^2} \right\}. \tag{51} $$

Indeed, since $r \notin \mathcal{L}_p$, the vectors v and $r - p$ cannot be linearly dependent so
that Lemma 9 implies the positivity of the expression under the square root.
If v is lightlike (50) becomes a linear equation which has one solution if
$v \cdot (r - p) \neq 0$ and no solution if $v \cdot (r - p) = 0$ [note that $(r - p)^2 \neq 0$ since
$r \notin \mathcal{L}_p$ by hypothesis]. □

Figure 1: A timelike line $\ell = \{r + \lambda v \mid \lambda \in \mathbb{R}\}$ intersects the light-cone with
vertex $p \notin \ell$ in two points: $q_+$, its intersection with the future light-cone,
and $q_-$, its intersection with the past light-cone. q is a point in between $q_+$
and $q_-$.

Proposition 11. Let $\ell$ and $\mathcal{L}_p$ be as in Lemma 10 with v timelike. Let $q_+$ and
$q_-$ be the two intersection points of $\ell$ with $\mathcal{L}_p$ and $q \in \ell$ a point between
them. Then

$$ \|q - p\|_g^2 = \|q_+ - q\|_g\, \|q - q_-\|_g. \tag{52} $$

Moreover, $\|q_+ - q\|_g = \|q - q_-\|_g$ iff $p - q$ is perpendicular to v.

Proof. The vectors $(q_+ - p) = (q - p) + (q_+ - q)$ and $(q_- - p) = (q - p) +
(q_- - q)$ are lightlike, which gives (note that $q - p$ is spacelike):

$$ \|q - p\|_g^2 = -(q - p)^2 = (q_+ - q)^2 + 2(q - p) \cdot (q_+ - q), \tag{53a} $$
$$ \|q - p\|_g^2 = -(q - p)^2 = (q_- - q)^2 + 2(q - p) \cdot (q_- - q). \tag{53b} $$

Since $q_+ - q$ and $q - q_-$ are parallel we have $q_+ - q = \lambda(q - q_-)$ with
$\lambda \in \mathbb{R}_+$ so that $(q_+ - q)^2 = \lambda \|q_+ - q\|_g \|q - q_-\|_g$ and
$\lambda (q_- - q)^2 = \|q_+ - q\|_g \|q - q_-\|_g$. Now, multiplying (53b) with $\lambda$ and
adding this to (53a) immediately yields

$$ (1 + \lambda)\, \|q - p\|_g^2 = (1 + \lambda)\, \|q_+ - q\|_g\, \|q - q_-\|_g. \tag{54} $$

Since $1 + \lambda \neq 0$ this implies (52). Finally, since $q_+ - q$ and $q_- - q$ are
antiparallel, $\|q_+ - q\|_g = \|q_- - q\|_g$ iff $(q_+ - q) = -(q_- - q)$. Equations
(53) now show that this is the case iff $(q - p) \cdot (q_\pm - q) = 0$, i.e. iff
$(q - p) \cdot v = 0$. Hence we have shown

$$ \|q_+ - q\|_g = \|q - q_-\|_g \iff (q - p) \cdot v = 0. \tag{55} $$

In other words, q is the midpoint of the segment $q_+ q_-$ iff the line through
p and q is perpendicular (wrt. g) to $\ell$. □

The somewhat surprising feature of the first statement of this proposition
is that (52) holds for any point of the segment $q_+ q_-$, not just the midpoint,
as it would have to be the case for the corresponding statement in Euclidean
geometry.
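
Both statements of the proposition can be checked on a concrete configuration. The sketch below is only an illustration in 1+2 dimensions with signature (+, -, -), assuming $\|u\|_g := \sqrt{|u \cdot u|}$ and the standard quadratic-formula solutions for the intersection parameters; the sample points p, r and the direction v, as well as the helper names, are hypothetical.

```python
import math

def dot(u, w):
    """Minkowski product of signature (+, -, -)."""
    return u[0] * w[0] - sum(a * b for a, b in zip(u[1:], w[1:]))

def norm(u):
    """Assumed norm ||u||_g := sqrt(|u . u|)."""
    return math.sqrt(abs(dot(u, u)))

def add(u, w, lam=1.0):
    """Return u + lam * w."""
    return tuple(a + lam * b for a, b in zip(u, w))

def cone_intersections(r, v, p):
    """Parameters (lambda_-, lambda_+) where r + lambda v meets the cone at p."""
    d = add(r, p, -1.0)  # r - p
    vv, vd = dot(v, v), dot(v, d)
    root = math.sqrt(vd * vd - vv * dot(d, d))
    return (-vd - root) / vv, (-vd + root) / vv

p, r, v = (0.0, 0.0, 0.0), (0.0, 2.0, 1.0), (1.0, 0.3, 0.1)
lam_minus, lam_plus = cone_intersections(r, v, p)
q_minus, q_plus = add(r, v, lam_minus), add(r, v, lam_plus)

def product_defect(frac):
    """||q-p||^2 - ||q_+ - q|| ||q - q_-|| for q at fraction frac of the segment."""
    q = add(r, v, lam_minus + frac * (lam_plus - lam_minus))
    lhs = norm(add(q, p, -1.0)) ** 2
    rhs = norm(add(q_plus, q, -1.0)) * norm(add(q, q_minus, -1.0))
    return lhs - rhs

# The midpoint of the segment should be g-orthogonal to v as seen from p.
midpoint = add(r, v, 0.5 * (lam_minus + lam_plus))
midpoint_orthogonality = dot(add(midpoint, p, -1.0), v)
```

The defect vanishes for every fraction between 0 and 1, not only for the midpoint, which is the feature singled out in the text.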

The second statement of Proposition 11 gives a convenient geometric
characterisation of Einstein-simultaneity. Recall that an event q on a timelike
line $\ell$ (representing an inertial observer) is defined to be Einstein-simultaneous
with an event p in spacetime iff q bisects the segment $q_+ q_-$
between the intersection points $q_+$, $q_-$ of $\ell$ with the double-lightcone at p.
Hence Proposition 11 implies

Corollary 12. Einstein simultaneity with respect to a timelike line $\ell$ is an
equivalence relation on spacetime, the equivalence classes of which are the
spacelike hyperplanes orthogonal (wrt. g) to $\ell$.

The first statement simply follows from the fact that the family of parallel
hyperplanes orthogonal to $\ell$ form a partition (cf. Sect. A.1) of spacetime.

From now on we shall use the terms 'timelike line' and 'inertial observer'
synonymously. Note that Einstein simultaneity is only defined relative to
an inertial observer. Given two inertial observers,

$$ \ell = \{r + \lambda v \mid \lambda \in \mathbb{R}\} \quad \text{first observer}, \tag{56a} $$
$$ \ell' = \{r' + \lambda' v' \mid \lambda' \in \mathbb{R}\} \quad \text{second observer}, \tag{56b} $$

we call the corresponding Einstein-simultaneity relations $\ell$-simultaneity and
$\ell'$-simultaneity. Obviously they coincide iff $\ell$ and $\ell'$ are parallel (v and v' are
linearly dependent). In this case $q' \in \ell'$ is $\ell$-simultaneous to $q \in \ell$ iff $q \in \ell$
is $\ell'$-simultaneous to $q' \in \ell'$. If $\ell$ and $\ell'$ are not parallel (skew or intersecting
in one point) it is generally not true that if $q' \in \ell'$ is $\ell$-simultaneous to $q \in \ell$
then $q \in \ell$ is also $\ell'$-simultaneous to $q' \in \ell'$. In fact, we have

Proposition 13. Let I and I' two non-parallel timelike likes. There exists 
a unique pair (q, q') G € x V so that q' is i- simultaneous to q and q is V 
simultaneous to c[' . 

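Proof. We parameterise $\ell$ and $\ell'$ as in (56). The two conditions for q'
being $\ell$-simultaneous to q and q being $\ell'$-simultaneous to q' are
$(q - q') \cdot v = 0 = (q - q') \cdot v'$. Writing $q = r + \lambda v$ and
$q' = r' + \lambda' v'$ this takes the form of the following matrix equation
for the two unknowns $\lambda$ and $\lambda'$:

$$\begin{pmatrix} v^2 & -v \cdot v' \\ v \cdot v' & -v'^2 \end{pmatrix} \begin{pmatrix} \lambda \\ \lambda' \end{pmatrix} = \begin{pmatrix} (r' - r) \cdot v \\ (r' - r) \cdot v' \end{pmatrix} .$$ (57)

This has a unique solution pair $(\lambda, \lambda')$, since for linearly
independent timelike vectors v and v' Lemma 9 implies
$(v \cdot v')^2 - v^2 v'^2 > 0$. Note that if $\ell$ and $\ell'$ intersect,
then q = q' = intersection point. □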
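
The proof can be traced numerically. The following is a minimal sketch, assuming the signature (+, -, -, -) and exact integer components; the function names and the sample lines are illustrative and not taken from the text. It solves the 2x2 system (57) by Cramer's rule and returns the mutually simultaneous pair (q, q').

```python
from fractions import Fraction


def mdot(a, b):
    """Minkowski inner product with signature (+, -, -, -)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def mutual_simultaneity(r, v, r2, v2):
    """Solve the 2x2 system (57) by Cramer's rule and return (q, q').

    r, v describe the first line, r2, v2 the second; integer or Fraction
    components are assumed so that the arithmetic stays exact.
    """
    a11, a12 = mdot(v, v), -mdot(v, v2)
    a21, a22 = mdot(v, v2), -mdot(v2, v2)
    d = [y - x for x, y in zip(r, r2)]           # r' - r
    b1, b2 = mdot(d, v), mdot(d, v2)
    det = a11 * a22 - a12 * a21                  # (v.v')^2 - v^2 v'^2
    if det == 0:
        raise ValueError("the two lines are parallel")
    lam = Fraction(b1 * a22 - a12 * b2, det)
    lam2 = Fraction(a11 * b2 - a21 * b1, det)
    q = [x + lam * y for x, y in zip(r, v)]
    q2 = [x + lam2 * y for x, y in zip(r2, v2)]
    return q, q2


# Two skew timelike lines (hypothetical sample data).
v, v2 = (1, 0, 0, 0), (2, 1, 0, 0)
q, q2 = mutual_simultaneity((0, 0, 0, 0), v, (0, 2, 1, 0), v2)
diff = [a - b for a, b in zip(q, q2)]            # q - q'
```

For these sample lines the difference q - q' is orthogonal to both v and v', as the two simultaneity conditions require.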

Clearly, Einstein-simultaneity is conventional and physics proper should
not depend on it. For example, the fringe-shift in the Michelson-Morley
experiment is independent of how we choose to synchronise clocks. In fact,
it does not even make use of any clock. So what is the general definition
of a 'simultaneity structure'? It seems obvious that it should be a relation
on spacetime that is at least reflexive (each event should be simultaneous
to itself). Going from one-way simultaneity to the mutual synchronisation
of two clocks, one might like to also require symmetry (if p is simultaneous
to q then q is simultaneous to p), though this is not strictly required in
order to one-way synchronise each clock in a set of clocks with one preferred
'master clock', which is sufficient for many applications.

Moreover, if we like to speak of the mutual simultaneity of sets of more
than two events we need an equivalence relation on spacetime. The
equivalence relation should be such that each inertial observer intersects
each equivalence class precisely once. Let us call such a simultaneity
structure 'admissible'. Clearly there are zillions of such structures: just
partition spacetime into any set of appropriate^14 spacelike hypersurfaces
(there are more possibilities at this point, like families of forward or
backward lightcones). An absolute admissible simultaneity structure would be
one which is invariant (cf. Sect. A.1) under the automorphism group of
spacetime. We have

Proposition 14. There exists precisely one admissible simultaneity structure
which is invariant under the inhomogeneous proper orthochronous Galilei group
and none that is invariant under the inhomogeneous proper orthochronous
Lorentz group.

A proof is given in [21]. There is a group-theoretic reason that highlights
this existential difference:

Proposition 15. Let G be a group with transitive action on a set S. Let
$\mathrm{Stab}(p) \subset G$ be the stabiliser subgroup for $p \in S$ (due to
transitivity all stabiliser subgroups are conjugate). Then S admits a
G-invariant equivalence relation $R \subset S \times S$ iff Stab(p) is not
maximal, that is, iff Stab(p) is properly contained in a proper subgroup H of
G: $\mathrm{Stab}(p) \subsetneq H \subsetneq G$.

A proof of this may be found in [31] (Theorem 1.12). Regarding the
action of the inhomogeneous Galilei and Lorentz groups on spacetime, their
stabilisers are the corresponding homogeneous groups. Now, the homogeneous
Lorentz group is maximal in the inhomogeneous one, whereas the
homogeneous Galilei group is not maximal in the inhomogeneous one, since
it can still be supplemented by time translations without the need to also
invoke space translations.^15 This, according to Proposition 15, is the
group-theoretic origin of the absence of any invariant simultaneity structure
in the Lorentzian case.
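
The content of Proposition 15 can be illustrated on small finite examples. The sketch below is an illustration under assumptions of its own: the helper names are hypothetical, and the proposition is read as referring to equivalence relations other than the two trivial ones (equality and the all-relation), which are invariant under any action. It enumerates all invariant partitions for the cyclic group Z_4 acting on four points, whose trivial stabiliser lies in the proper subgroup {0, 2}, and for S_3 acting on three points, whose point stabilisers are maximal.

```python
from itertools import permutations


def set_partitions(items):
    """Yield every partition of the list `items` as a list of frozensets."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [part[i] | {first}] + part[i + 1:]
        yield part + [frozenset({first})]


def invariant_partitions(points, group):
    """Partitions of `points` mapped to themselves by every element of `group`.

    Each group element is a tuple g with g[p] the image of point p.
    """
    found = []
    for part in set_partitions(list(points)):
        blocks = set(part)
        if all(frozenset(g[p] for p in b) in blocks for g in group for b in part):
            found.append(blocks)
    return found


points = range(4)
cyclic = [tuple((p + k) % 4 for p in points) for k in range(4)]   # Z_4
symmetric = list(permutations(range(3)))                          # S_3

z4_invariant = invariant_partitions(points, cyclic)
s3_invariant = invariant_partitions(range(3), symmetric)
```

In this toy setting Z_4 admits one non-trivial invariant partition, {{0, 2}, {1, 3}}, whereas S_3 admits only the two trivial ones, which mirrors the Galilei versus Lorentz dichotomy described above.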

^14 For example, the hypersurfaces should not be asymptotically hyperboloidal,
for then a constantly accelerated observer would not intersect all of them.

^15 The homogeneous Galilei group only acts on the spatial translations, not
the time translations, whereas the homogeneous Lorentz group acts irreducibly
on the vector space of translations.

However, one may ask whether there are simultaneity structures relative
to some additional structure X. As additional structure, X, one could,
for example, take an inertial reference frame, which is characterised by a
foliation of spacetime by parallel timelike lines. The stabiliser subgroup of
that structure within the proper orthochronous Poincaré group is given by
the semidirect product of spacetime translations with all rotations in the
hypersurfaces perpendicular to the lines in X:

$\mathrm{Stab}_X(\mathrm{ILor}^\uparrow_+) = \mathbb{R}^4 \rtimes \mathrm{SO}(3)$ . (58)

Here the SO(3) only acts on the spatial translations, so that the group is
also isomorphic to $\mathbb{R} \times \mathrm{E}(3)$, where E(3) is the group
of Euclidean motions in 3 dimensions (the hyperplanes perpendicular to the
lines in X). We can now ask: how many admissible
$\mathrm{Stab}_X(\mathrm{ILor}^\uparrow_+)$-invariant equivalence relations
are there? The answer is

Proposition 16. There exists precisely one admissible simultaneity structure
which is invariant under $\mathrm{Stab}_X(\mathrm{ILor}^\uparrow_+)$, where X
represents an inertial reference frame (a foliation of spacetime by parallel
timelike lines). It is given by Einstein simultaneity, that is, the
equivalence classes are the hyperplanes perpendicular to the lines in X.

The proof is given in [21]. Note again the connection to the quoted
group-theoretic result: The stabiliser subgroup of a point in
$\mathrm{Stab}_X(\mathrm{ILor}^\uparrow_+)$ is SO(3), which is clearly not
maximal in $\mathrm{Stab}_X(\mathrm{ILor}^\uparrow_+)$ since it is a proper
subgroup of E(3) which, in turn, is a proper subgroup of
$\mathrm{Stab}_X(\mathrm{ILor}^\uparrow_+)$.

3.2 The lattices of causally and chronologically complete sets 

Here we wish to briefly discuss another important structure associated with
causality relations in Minkowski space, which plays a fundamental role in
modern Quantum Field Theory (see e.g. [27]). Let $S_1$ and $S_2$ be subsets of
$\mathbb{M}^n$. We say that $S_1$ and $S_2$ are causally disjoint or spacelike
separated iff $p_1 - p_2$ is spacelike, i.e. $(p_1 - p_2)^2 < 0$, for any
$p_1 \in S_1$ and $p_2 \in S_2$. Note that because a point is not spacelike
separated from itself, causally disjoint sets are necessarily disjoint in the
ordinary set-theoretic sense, the converse being of course not true.

For any subset $S \subseteq \mathbb{M}^n$ we denote by S' the largest subset
of $\mathbb{M}^n$ which is causally disjoint to S. The set S' is called the
causal complement of S. The procedure of taking the causal complement can be
iterated and we set S'' := (S')' etc. S'' is called the causal completion of
S. It also follows straight from the definition that $S_1 \subseteq S_2$
implies $S_1' \supseteq S_2'$ and also $S'' \supseteq S$. If S'' = S we call S
causally complete. We note that the causal complement S' of any given S is
automatically causally complete. Indeed, taking causal complements in
$S'' \supseteq S$ reverses the inclusion and gives $(S')'' \subseteq S'$,
whereas the inclusion $S'' \supseteq S$ applied to S' instead of S gives
$(S')'' \supseteq S'$, showing $(S')'' = S'$. Note also that for any subset S
its causal completion, S'', is the smallest causally complete subset
containing S, for if $S \subseteq K \subseteq S''$ with K'' = K, we derive
from the first inclusion by taking '' that $S'' \subseteq K$, so that the
second inclusion yields K = S''. Trivial examples of causally complete subsets
of $\mathbb{M}^n$ are the empty set, single points, and the total set
$\mathbb{M}^n$. Others are the open diamond-shaped regions (15) as well as
their closed counterparts:

$\bar{U}(p, q) := (\bar{C}^+_p \cap \bar{C}^-_q) \cup (\bar{C}^+_q \cap \bar{C}^-_p)$ . (59)
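
The formal properties of the dashing operation use only that spacelike separation is a symmetric relation which no point has with itself. They can therefore be checked on a finite sample. The following sketch makes two assumptions that are not in the text: it works in 1+1 dimensions with c = 1, and it replaces Minkowski space by a small grid of events, so complements are taken relative to that grid.

```python
from itertools import product


def spacelike(p, q):
    """True iff p - q is spacelike in 1+1 dimensions: (p - q)^2 < 0, c = 1."""
    dt, dx = p[0] - q[0], p[1] - q[1]
    return dt * dt - dx * dx < 0


# Finite stand-in for Minkowski space (an assumption of this sketch).
GRID = frozenset(product(range(-3, 4), repeat=2))


def complement(s):
    """Causal complement relative to GRID: points spacelike to all of s."""
    return frozenset(p for p in GRID if all(spacelike(p, q) for q in s))


S = frozenset({(0, 0), (1, 0)})    # two timelike separated events (t, x)
S1 = complement(S)                 # S'
S2 = complement(S1)                # S''
S3 = complement(S2)                # S'''
```

On the grid one finds that S is contained in S'', that S''' equals S', and that S and S' are disjoint, in line with the general statements above.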

We now focus attention on the set $\mathrm{Caus}(\mathbb{M}^n)$ of causally
complete subsets of $\mathbb{M}^n$, including the empty set, $\emptyset$, and
the total set, $\mathbb{M}^n$, which are mutually causally complementary. It
is partially ordered by ordinary set-theoretic inclusion ($\subseteq$) (cf.
Sect. A.1) and carries the 'dashing operation' (') of taking the causal
complement. Moreover, on $\mathrm{Caus}(\mathbb{M}^n)$ we can define the
operations of 'meet' and 'join', denoted by $\wedge$ and $\vee$ respectively,
as follows: Let $S_i \in \mathrm{Caus}(\mathbb{M}^n)$ where i = 1, 2, then
$S_1 \wedge S_2$ is the largest causally complete subset in the intersection
$S_1 \cap S_2$ and $S_1 \vee S_2$ is the smallest causally complete set
containing the union $S_1 \cup S_2$.

The operations of $\wedge$ and $\vee$ can be characterised in terms of the
ordinary set-theoretic intersection $\cap$ together with the dashing
operation. To see this, consider two causally complete sets, $S_i$ where
i = 1, 2, and note that the set of points that are spacelike separated from
$S_1$ and $S_2$ is obviously given by $S_1' \cap S_2'$, but also by
$(S_1 \cup S_2)'$, so that

$S_1' \cap S_2' = (S_1 \cup S_2)'$ , (60a)
$S_1 \cap S_2 = (S_1' \cup S_2')'$ . (60b)

Here (60a) and (60b) are equivalent since any
$S_i \in \mathrm{Caus}(\mathbb{M}^n)$ can be written as $S_i = P_i'$, namely
$P_i = S_i'$. If $S_i$ runs through all sets in $\mathrm{Caus}(\mathbb{M}^n)$
so does $P_i$. Hence any equation that holds generally for all
$S_i \in \mathrm{Caus}(\mathbb{M}^n)$ remains valid if the $S_i$ are replaced
by $S_i'$.

Equation (60b) immediately shows that $S_1 \cap S_2$ is causally complete
(since it is the ' of something). Taking the causal complement of (60a) we
obtain the desired relation for $S_1 \vee S_2 := (S_1 \cup S_2)''$. Together
we have

$S_1 \wedge S_2 = S_1 \cap S_2$ , (61a)
$S_1 \vee S_2 = (S_1' \cap S_2')'$ . (61b)

From these we immediately derive

$(S_1 \wedge S_2)' = S_1' \vee S_2'$ , (62a)
$(S_1 \vee S_2)' = S_1' \wedge S_2'$ . (62b)
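
Relations (61) and (62) can be checked mechanically on a finite model. The sketch below is only an illustration: it assumes 1+1 dimensions with c = 1 and a 3x3 grid of events in place of Minkowski space, and the names `comp`, `meet` and `join` are chosen here. Meet and join are defined by (61a) and (61b), and (62a), (62b) are then tested on every pair of complete sets generated from small subsets.

```python
from itertools import combinations, product


def spacelike(p, q):
    """True iff p - q is spacelike in 1+1 dimensions with c = 1."""
    dt, dx = p[0] - q[0], p[1] - q[1]
    return dt * dt - dx * dx < 0


GRID = frozenset(product(range(-1, 2), repeat=2))   # 9 sample events (t, x)


def comp(s):
    """Causal complement relative to GRID."""
    return frozenset(p for p in GRID if all(spacelike(p, q) for q in s))


def meet(a, b):
    return a & b                                     # (61a)


def join(a, b):
    return comp(comp(a) & comp(b))                   # (61b)


# Causally complete sets obtained as complements of subsets with <= 2 points.
seeds = [frozenset(c) for k in (0, 1, 2) for c in combinations(sorted(GRID), k)]
complete = {comp(s) for s in seeds}

de_morgan_holds = all(
    comp(meet(a, b)) == join(comp(a), comp(b))       # (62a)
    and comp(join(a, b)) == meet(comp(a), comp(b))   # (62b)
    and comp(comp(meet(a, b))) == meet(a, b)         # the meet is complete
    for a, b in product(complete, repeat=2)
)
```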

All that we have said so far for the set $\mathrm{Caus}(\mathbb{M}^n)$ could
be repeated verbatim for the set $\mathrm{Chron}(\mathbb{M}^n)$ of
chronologically complete subsets. We say that $S_1$ and $S_2$ are
chronologically disjoint or non-timelike separated, iff
$S_1 \cap S_2 = \emptyset$ and $(p_1 - p_2)^2 \le 0$ for any $p_1 \in S_1$ and
$p_2 \in S_2$. S', the chronological complement of S, is now the largest
subset of $\mathbb{M}^n$ which is chronologically disjoint to S. The only
difference between the causal and the chronological complement of S is that
the latter now contains lightlike separated points outside S. A set S is
chronologically complete iff S = S'', where the dashing now denotes the
operation of taking the chronological complement. Again, for any set S the set
S' is automatically chronologically complete and S'' is the smallest
chronologically complete subset containing S. Single points are
chronologically complete subsets. All the formal properties regarding ',
$\wedge$, and $\vee$ stated hitherto for $\mathrm{Caus}(\mathbb{M}^n)$ are the
same for $\mathrm{Chron}(\mathbb{M}^n)$.

One major difference between $\mathrm{Caus}(\mathbb{M}^n)$ and
$\mathrm{Chron}(\mathbb{M}^n)$ is that the types of diamond-shaped sets they
contain are different. For example, the closed ones, (59), are members of
both. The open ones, (15), are contained in $\mathrm{Caus}(\mathbb{M}^n)$ but
not in $\mathrm{Chron}(\mathbb{M}^n)$. Instead, $\mathrm{Chron}(\mathbb{M}^n)$
contains the closed diamonds whose 'equator'^16 has been removed. An essential
structural difference between $\mathrm{Caus}(\mathbb{M}^n)$ and
$\mathrm{Chron}(\mathbb{M}^n)$ will be stated below, after we have introduced
the notion of a lattice to which we now turn.

To put all these formal properties into the right frame we recall the
definition of a lattice. Let $(L, \le)$ be a partially ordered set and a, b
any two elements in L. Synonymously with $a \le b$ we also write $b \ge a$ and
say that a is smaller than b, b is bigger than a, or b majorises a. We also
write $a < b$ if $a \le b$ and $a \ne b$. If, with respect to $\le$, their
greatest lower and least upper bound exist, they are denoted by $a \wedge b$
(called the 'meet of a and b') and $a \vee b$ (called the 'join of a and b')
respectively. A partially ordered set for which the greatest lower and least
upper bound exist for any pair a, b of elements from L is called a lattice.

We now list some of the most relevant additional structural elements
lattices can have: A lattice is called complete if greatest lower and least
upper bound exist for any subset $K \subseteq L$. If K = L they are called 0
(the smallest element in the lattice) and 1 (the biggest element in the
lattice) respectively. An atom in a lattice is an element a which majorises
only 0, i.e. $a \ne 0$ and if $0 \le b \le a$ then b = 0 or b = a. The lattice
is called atomic if each of its elements different from 0 majorises an atom.
An atomic lattice is called atomistic if every element is the join of the
atoms it majorises. An element c is said to cover a if $a < c$ and if
$a \le b \le c$ either a = b or b = c. An atomic lattice is said to have the
covering property if, for every element b and every atom a for which
$a \wedge b = 0$, the join $a \vee b$ covers b.

The subset $\{a, b, c\} \subseteq L$ is called a distributive triple if

$a \wedge (b \vee c) = (a \wedge b) \vee (a \wedge c)$ and (a, b, c) cyclically permuted , (63a)

$a \vee (b \wedge c) = (a \vee b) \wedge (a \vee c)$ and (a, b, c) cyclically permuted . (63b)

Definition 4. A lattice is called distributive or Boolean if every triple
{a, b, c} is distributive. It is called modular if every triple {a, b, c} with
$a \le b$ is distributive.

^16 By 'equator' we mean the (n - 2)-sphere in which the forward and backward
light-cones in (59) intersect. In the two-dimensional drawings the 'equator'
is represented by just two points marking the right and left corners of the
diamond-shaped set.

It is straightforward to check from (63) that modularity is equivalent to
the following single condition:

modularity $\Leftrightarrow$ $a \vee (b \wedge c) = b \wedge (a \vee c)$ for all $a, b, c \in L$ s.t. $a \le b$. (64)
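
Condition (64) lends itself to a brute-force test on finite lattices. The sketch below is illustrative only; the helper `make_lattice` and the two sample lattices are assumptions of this example, namely the five-element 'pentagon' (0 < a < b < 1 and 0 < c < 1), which is the standard non-modular example, and the five-element 'diamond' with three incomparable atoms, which is modular.

```python
from itertools import product


def make_lattice(elements, strict):
    """Return (le, meet, join) for a finite lattice with bottom "0" and top "1".

    `strict` lists the remaining pairs x < y and must be transitively closed.
    """
    def le(x, y):
        return x == y or x == "0" or y == "1" or (x, y) in strict

    def meet(x, y):
        lower = [z for z in elements if le(z, x) and le(z, y)]
        return next(z for z in lower if all(le(w, z) for w in lower))

    def join(x, y):
        upper = [z for z in elements if le(x, z) and le(y, z)]
        return next(z for z in upper if all(le(z, w) for w in upper))

    return le, meet, join


def is_modular(elements, strict):
    """Check condition (64) for all a, b, c with a <= b."""
    le, meet, join = make_lattice(elements, strict)
    return all(
        join(a, meet(b, c)) == meet(b, join(a, c))
        for a, b, c in product(elements, repeat=3)
        if le(a, b)
    )


pentagon = (["0", "a", "b", "c", "1"], {("a", "b")})   # 0 < a < b < 1, 0 < c < 1
diamond = (["0", "x", "y", "z", "1"], set())           # three incomparable atoms

pentagon_modular = is_modular(*pentagon)
diamond_modular = is_modular(*diamond)
```

In the pentagon the triple (a, b, c) violates (64), since the left-hand side is a and the right-hand side is b.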

If in a lattice with smallest element 0 and greatest element 1 a map
$L \to L$, $a \mapsto a'$, exists such that

$a'' := (a')' = a$ , (65a)
$a \le b \Rightarrow b' \le a'$ , (65b)
$a \wedge a' = 0$ , $a \vee a' = 1$ , (65c)

the lattice is called orthocomplemented. It follows that whenever the meet
and join of a subset $\{a_i \mid i \in I\}$ (I is some index set) exist one
has De Morgan's laws^17:

$\bigl(\bigwedge_{i \in I} a_i\bigr)' = \bigvee_{i \in I} a_i'$ , (66a)
$\bigl(\bigvee_{i \in I} a_i\bigr)' = \bigwedge_{i \in I} a_i'$ . (66b)

For orthocomplemented lattices there is a still weaker version of
distributivity than modularity, which turns out to be physically relevant in
various contexts:

Definition 5. An orthocomplemented lattice is called orthomodular if every
triple {a, b, c} with $a \le b$ and $c \le b'$ is distributive.

From (64) and using that $b \wedge c = 0$ for $b \le c'$ one sees that this is
equivalent to the single condition (renaming c to c'):

orthomod. $\Leftrightarrow$ $a = b \wedge (a \vee c')$ for all $a, b, c \in L$ s.t. $a \le b \le c$ , (67a)

$\Leftrightarrow$ $a = b \vee (a \wedge c')$ for all $a, b, c \in L$ s.t. $a \ge b \ge c$ , (67b)

where the second line follows from the first by taking its orthocomplement
and renaming a', b', c to a, b, c'. It turns out that these conditions can
still be simplified by making them independent of c. In fact, (67) are
equivalent to

orthomod. $\Leftrightarrow$ $a = b \wedge (a \vee b')$ for all $a, b \in L$ s.t. $a \le b$ , (68a)
$\Leftrightarrow$ $a = b \vee (a \wedge b')$ for all $a, b \in L$ s.t. $a \ge b$ . (68b)

^17 From these laws it also appears that the definition (65c) is redundant, as
each of its two statements follows from the other, due to 0' = 1.

It is obvious that (67) implies (68) (set c = b). But the converse is also
true. To see this, take e.g. (68b) and choose any $c \le b$. Then
$c' \ge b'$, $a \ge b$ (by hypothesis), and $a \ge a \wedge c'$ (trivially),
so that $a \ge b \vee (a \wedge c')$. Hence
$a \ge b \vee (a \wedge c') \ge b \vee (a \wedge b') = a$, which proves (67b).
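
As a finite illustration of criterion (68a), the following sketch compares two six-element orthocomplemented lattices. Both are examples assumed here, not taken from the text: the 'hexagon', with a < b and b' < a', and the lattice in which a, a', b, b' are pairwise incomparable. The helper names are likewise hypothetical.

```python
from itertools import product

ELEMENTS = ["0", "a", "b", "a'", "b'", "1"]
COMP = {"0": "1", "1": "0", "a": "a'", "a'": "a", "b": "b'", "b'": "b"}


def lattice_ops(strict):
    """(le, meet, join) for bottom "0", top "1" and extra strict pairs x < y."""
    def le(x, y):
        return x == y or x == "0" or y == "1" or (x, y) in strict

    def meet(x, y):
        lower = [z for z in ELEMENTS if le(z, x) and le(z, y)]
        return next(z for z in lower if all(le(w, z) for w in lower))

    def join(x, y):
        upper = [z for z in ELEMENTS if le(x, z) and le(y, z)]
        return next(z for z in upper if all(le(z, w) for w in upper))

    return le, meet, join


def is_orthomodular(strict):
    """Check (68a): a = b ^ (a v b') whenever a <= b."""
    le, meet, join = lattice_ops(strict)
    return all(
        a == meet(b, join(a, COMP[b]))
        for a, b in product(ELEMENTS, repeat=2)
        if le(a, b)
    )


hexagon = {("a", "b"), ("b'", "a'")}    # a < b and b' < a'
lantern = set()                         # a, a', b, b' pairwise incomparable

hexagon_orthomodular = is_orthomodular(hexagon)
lantern_orthomodular = is_orthomodular(lantern)
```

In the hexagon one has a v b' = 1, so b ^ (a v b') = b, which differs from a; the hexagon is thus orthocomplemented but not orthomodular.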

Complete orthomodular atomic lattices are automatically atomistic. Indeed,
let b be the join of all atoms majorised by $a \ne 0$. Assume $a \ne b$ so
that necessarily $b < a$, then (68b) implies $a \wedge b' \ne 0$. Then there
exists an atom c majorised by $a \wedge b'$. This implies $c \le a$ and
$c \le b'$, hence also $c \not\le b$. But this is a contradiction, since b is
by definition the join of all atoms majorised by a.

Finally we mention the notion of compatibility or commutativity, which
is a symmetric, reflexive, but generally not transitive relation R on an
orthomodular lattice (cf. Sec. A.1). We write $a \natural b$ for
$(a, b) \in R$ and define:

$a \natural b$ $\Leftrightarrow$ $a = (a \wedge b) \vee (a \wedge b')$ , (69a)
$\Leftrightarrow$ $b = (b \wedge a) \vee (b \wedge a')$ . (69b)

The equivalence of these two lines, which shows that the relation of being
compatible is indeed symmetric, can be demonstrated using orthomodularity
as follows: Suppose (69a) holds; then
$b \wedge a' = b \wedge (b' \vee a') \wedge (b \vee a') = b \wedge (b' \vee a')$,
where we used the orthocomplement of (69a) to replace a' in the first
expression and the trivial identity $b \wedge (b \vee a') = b$ in the second
step. Now, applying (68b) to $b \ge a \wedge b$ we get
$b = (b \wedge a) \vee [b \wedge (b' \vee a')] = (b \wedge a) \vee (b \wedge a')$,
i.e. (69b). The converse, (69b) $\Rightarrow$ (69a), is of course entirely
analogous.

From (69) a few things are immediate: $a \natural b$ is equivalent to
$a \natural b'$, $a \natural b$ is implied by $a \le b$ or $a \le b'$, and the
elements 0 and 1 are compatible with all elements in the lattice. The centre
of a lattice is the set of elements which are compatible with all elements in
the lattice. In fact, the centre is a Boolean sublattice. If the centre
contains no other elements than 0 and 1 the lattice is said to be irreducible.
The other extreme is a Boolean lattice, which is identical to its own centre.
Indeed, if (a, b, b') is a distributive triple, one has
$a = a \wedge 1 = a \wedge (b \vee b') = (a \wedge b) \vee (a \wedge b')$
$\Rightarrow$ (69a).
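
Compatibility and the centre can be made concrete on a small example. The sketch below uses a six-element orthomodular lattice chosen purely for illustration (bottom 0, top 1, and four pairwise incomparable atoms a, a', b, b'); it evaluates (69a) for all pairs and collects the centre.

```python
ELEMENTS = ["0", "a", "a'", "b", "b'", "1"]
COMP = {"0": "1", "1": "0", "a": "a'", "a'": "a", "b": "b'", "b'": "b"}


def le(x, y):
    """Order with bottom "0" and top "1"; the four atoms are incomparable."""
    return x == y or x == "0" or y == "1"


def meet(x, y):
    lower = [z for z in ELEMENTS if le(z, x) and le(z, y)]
    return next(z for z in lower if all(le(w, z) for w in lower))


def join(x, y):
    upper = [z for z in ELEMENTS if le(x, z) and le(y, z)]
    return next(z for z in upper if all(le(z, w) for w in upper))


def compatible(x, y):
    """Condition (69a): x = (x ^ y) v (x ^ y')."""
    return x == join(meet(x, y), meet(x, COMP[y]))


# Elements compatible with everything form the centre.
centre = [x for x in ELEMENTS if all(compatible(x, y) for y in ELEMENTS)]
symmetric = all(compatible(x, y) == compatible(y, x)
                for x in ELEMENTS for y in ELEMENTS)
```

Here a is compatible with a' but not with b, the relation comes out symmetric as the argument above predicts, and the centre consists of 0 and 1 only, so this example lattice is irreducible.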

After this digression into elementary notions of lattice theory we come
back to our examples of the sets $\mathrm{Caus}(\mathbb{M}^n)$ and
$\mathrm{Chron}(\mathbb{M}^n)$. Our statements above amount to saying that
they are complete, atomic, and orthocomplemented lattices. The partial order
relation $\le$ is given by $\subseteq$ and the extreme elements 0 and 1
correspond to the empty set and the total set $\mathbb{M}^n$, the points of
which are the atoms. Neither the covering property nor modularity is shared
by either of the two lattices, as can be checked by way of elementary
counterexamples.^18 In particular, neither of them is Boolean. However, in
[15] it was shown that $\mathrm{Chron}(\mathbb{M}^n)$ is orthomodular; see
also [13] which deals with more general spacetimes. Note that by the argument
given above this implies that $\mathrm{Chron}(\mathbb{M}^n)$ is atomistic. In
contrast, $\mathrm{Caus}(\mathbb{M}^n)$ is definitely not orthomodular, as is
e.g. seen by the counterexample given in Fig. 2. It is also not difficult to
prove that $\mathrm{Chron}(\mathbb{M}^n)$ is irreducible.^20

It is well known that the lattices of propositions for classical systems
are Boolean, whereas those for quantum systems are merely orthomodular.
In classical physics the elements of the lattice are measurable subsets of
phase space, with $\le$ being ordinary set-theoretic inclusion $\subseteq$,
and $\wedge$ and $\vee$ being ordinary set-theoretic intersection $\cap$ and
union $\cup$ respectively. The orthocomplement is the ordinary set-theoretic
complement. In Quantum Mechanics the elements of the lattice are the closed
subspaces of Hilbert space, with $\le$ being again ordinary inclusion,
$\wedge$ ordinary intersection, and $\vee$ given by
$a \vee b := \mathrm{span}\{a, b\}$. The orthocomplement of a closed subspace
is the orthogonal complement in Hilbert space. For comprehensive discussions
see [33] and @|. 

One of the main questions in the foundations of Quantum Mechanics is 
whether one could understand (derive) the usage of Hilbert spaces and com- 
plex numbers from somehow more fundamental principles. Even though it 
is not a priori clear what one's measure of fundamentality should be at this
point, an interesting line of attack consists in deriving the mentioned struc- 
tures from the properties of the lattice of propositions (Quantum Logic). It 
can be shown that a lattice that is complete, atomic, irreducible, orthomod- 
ular, and that satisfies the covering property is isomorphic to the lattice of 
closed subspaces of a linear space with Hermitean inner product. The com- 
plex numbers are selected if additional technical assumptions are added. For 
the precise statements of these reconstruction theorems see H]. 

It is now interesting to note that, on a formal level, there is a similar 

^18 An immediate counterexample for the covering property is this: Take two
timelike separated points (i.e. atoms) p and q. Then
$\{p\} \wedge \{q\} = 0$ whereas $\{p\} \vee \{q\}$ is given by the closed
diamond (59). Note that this is true in $\mathrm{Caus}(\mathbb{M}^n)$ and
$\mathrm{Chron}(\mathbb{M}^n)$. But, clearly, $\{p\} \vee \{q\}$ does not
cover either {p} or {q}.
^19 Regarding this point, there are some conflicting statements in the
literature. The first edition of [27] states orthomodularity of
$\mathrm{Chron}(\mathbb{M}^n)$ in Proposition 4.1.3, which is removed in the
second edition without further comment. The proof offered in the first
edition uses (68a) as definition of orthomodularity, writing $K_1$ for a and
$K_2$ for b. The crucial step is the claim that any spacetime event in the
set $K_2 \wedge (K_1 \vee K_2')$ lies in $K_2$ and that any causal line
through it must intersect either $K_1$ or $K_2'$. The last statement is,
however, not correct since the join of two sets (here $K_1$ and $K_2'$) is
generally larger than the domain of dependence of their ordinary
set-theoretic union; compare Fig. 2. (Generally, the domain of dependence of
a subset S of spacetime M is the largest subset $D(S) \subseteq M$ such that
any inextensible causal curve that intersects D(S) also intersects S.)

In general spacetimes M, the failure of irreducibility of Chron(M) is directly related to 
the existence of closed timelike curves; see [TH] . 







Figure 2: The two figures show that $\mathrm{Caus}(\mathbb{M}^n)$ is not
orthomodular. The first thing to note is that $\mathrm{Caus}(\mathbb{M}^n)$
contains open (15) as well as closed (59) diamond sets. In the left picture we
consider the join of a small closed diamond a with a large open diamond b'.
(Closed sets are indicated by a solid boundary line.) Their edges are aligned
along the lightlike line $\ell$. Even though these regions are causally
disjoint, their causal completion is much larger than their union and given
by the open (for n > 2) enveloping diamond $a \vee b'$ framed by the dashed
line. (This also shows that the join of two regions can be larger than the
domain of dependence of their union; compare footnote 19.) Next we consider
the situation depicted on the right side. The closed double-wedge region b
contains the small closed diamond a. The causal complement b' of b is the
open diamond in the middle. $a \vee b'$ is, according to the first picture,
given by the large open diamond enclosed by the dashed line. The intersection
of $a \vee b'$ with b is strictly larger than a, the difference being the
dark-shaded region in the left wedge of b below a. Hence
$a \ne b \wedge (a \vee b')$, in contradiction to (68a).

transition in going from Galilei invariant to Lorentz invariant causality re- 
lations. In fact, in Galilean spacetime one can also define a chronological 
complement: Two points are chronologically related if they are connected 
by a worldline of finite speed and, accordingly, two subsets in spacetime are 
chronologically disjoint if no point in one set is chronologically related to a 
point of the other. For example, the chronological complement of a point 
p are all points simultaneous to, but different from, p. More generally, it is
not hard to see that the chronologically complete sets are just the subsets of
some t = const. hypersurface. The lattice of chronologically complete sets is
then the continuous disjoint union of sublattices, each of which is isomorphic 
to the Boolean lattice of subsets in M^. For details see [14) . 

As we have seen above, $\mathrm{Chron}(\mathbb{M}^n)$ is complete, atomic,
irreducible, and orthomodular (hence atomistic). The main difference to the
lattice of propositions in Quantum Mechanics, as regards the formal aspects
discussed here, is that $\mathrm{Chron}(\mathbb{M}^n)$ does not satisfy the
covering property. Otherwise the formal similarities are intriguing and it is
tempting to ask whether there is a deeper meaning to this. In this respect it
would be interesting to know whether one could give a lattice-theoretic
characterisation for Chron(M) (M some fixed spacetime), comparable to the
characterisation of the lattices of closed subspaces in Hilbert space alluded
to above. Even for $M = \mathbb{M}^n$ such a characterisation seems, as far as
I am aware, not to be known.

3.3 Rigid motion 

As is well known, the notion of a rigid body, which proves so useful in
Newtonian mechanics, is incompatible with the existence of a universal finite
upper bound for all signal velocities [36]. As a result, the notion of a
perfectly rigid body does not exist within the framework of SR. However, the
notion of a rigid motion does exist. Intuitively speaking, a body moves
rigidly if, locally, the relative spatial distances of its material
constituents are unchanging.

The motion of an extended body is described by a normalised timelike
vector field $u : \Omega \to \mathbb{R}^n$, where $\Omega$ is an open subset
of Minkowski space, consisting of the events where the material body in
question 'exists'. We write $g(u, u) = u \cdot u = u^2$ for the Minkowskian
scalar product. Being normalised now means that $u^2 = c^2$ (we do not choose
units such that c = 1). The Lie derivative with respect to u is denoted by
$L_u$.

For each material part of the body in motion its local rest space at the
event $p \in \Omega$ can be identified with the hyperplane through p
orthogonal to $u_p$:

$H_p := p + u_p^\perp$ . (70)

$u_p^\perp$ carries a Euclidean inner product, $h_p$, given by the restriction
of $-g$ to $u_p^\perp$. Generally we can write

$h = c^{-2}\, u^\flat \otimes u^\flat - g$ , (71)

where $u^\flat = g^\downarrow(u) := g(u, \cdot)$ is the one-form associated to
u. Following [9] the precise definition of 'rigid motion' can now be given as
follows:

Definition 6 (Born 1909). Let u be a normalised timelike vector field.
The motion described by its flow is rigid if

$L_u h = 0$ . (72)

Note that, in contrast to the Killing equations $L_u g = 0$, these equations
are non-linear due to the dependence of h upon u.

We write $\Pi_h := \mathrm{id} - c^{-2}\, u \otimes u^\flat \in \mathrm{End}(\mathbb{R}^n)$
for the tensor field over spacetime that pointwise projects vectors
perpendicular to u. It acts on one-forms $\alpha$ via
$\Pi_h(\alpha) := \alpha \circ \Pi_h$ and accordingly on all tensors. The so
extended projection map will still be denoted by $\Pi_h$. Then we e.g. have

$h = -\Pi_h g := -g(\Pi_h \cdot, \Pi_h \cdot)$ . (73)

It is not difficult to derive the following two equations

$L_{fu} h = f L_u h$ , (74)
$L_u h = -L_u(\Pi_h g) = -\Pi_h(L_u g)$ , (75)

where f is any differentiable real-valued function on $\Omega$.

Equation (74) shows that the normalised vector field u satisfies (72)
iff any rescaling fu with a nowhere vanishing function f does. Hence the
normalisation condition for u in (72) is really irrelevant. It is the
geometry in spacetime of the flow lines and not their parameterisation which
decides on whether motions (all, i.e. for any parameterisation, or none) along
them are rigid. This has to be the case because, generally speaking, there is
no distinguished family of sections (hypersurfaces) across the bundle of flow
lines that would represent 'the body in space', i.e. mutually simultaneous
locations of the body's points. Distinguished cases are those exceptional
ones in which u is hypersurface orthogonal. Then the intersections of u's
flow lines with the orthogonal hypersurfaces consist of mutually Einstein
synchronous locations of the points of the body. An example is discussed
below.

Equation (75) shows that the rigidity condition is equivalent to the
'spatially' projected Killing equation. We call the flow of the timelike
normalised vector field u a Killing motion (i.e. a spacetime isometry) if
there is a Killing field K such that $u = cK/\sqrt{K^2}$. Equation (75)
immediately implies that Killing motions are rigid. What about the converse?
Are there rigid motions that are not Killing? This turns out to be a
difficult question. Its answer in Minkowski space is: 'yes, many, but not as
many as naively expected.'

Before we explain this, let us give an illustrative example for a Killing
motion, namely that generated by the boost Killing-field in Minkowski space.
We suppress all but one spatial direction and consider boosts in x direction
in two-dimensional Minkowski space (coordinates ct and x; metric
$ds^2 = c^2 dt^2 - dx^2$). The Killing field is^22

$K = x\, \partial_{ct} + ct\, \partial_x$ , (76)

which is timelike in the region |x| > |ct|. We focus on the 'right wedge'
x > |ct|, which is now our region $\Omega$. Consider a rod of length $\ell$
which at t = 0 is represented by the interval $x \in (r, r + \ell)$, where
r > 0. The flow of the normalised field $u = cK/\sqrt{K^2}$ is

$ct(\tau) = x_0 \sinh(c\tau/x_0)$ , (77a)
$x(\tau) = x_0 \cosh(c\tau/x_0)$ , (77b)

where $x_0 = x(\tau = 0) \in (r, r + \ell)$ labels the elements of the rod at
$\tau = 0$. We have $x^2 - c^2 t^2 = x_0^2$, showing that the individual
elements of the rod move on hyperbolae ('hyperbolic motion'). $\tau$ is the
proper time along each orbit, normalised so that the rod lies on the x axis at
$\tau = 0$.

^21 Equation (75) simply follows from
$L_u \Pi_h = -c^{-2}\, u \otimes L_u u^\flat$, so that
$g((L_u \Pi_h)X, \Pi_h Y) = 0$ for all X, Y. In fact, $L_u u^\flat = a^\flat$,
where $a := \nabla_u u$ is the spacetime-acceleration. This follows from
$L_u u^\flat(X) = L_u(g(u, X)) - g(u, L_u X) = g(\nabla_u u, X) + g(u, \nabla_u X - [u, X]) = g(a, X) + g(u, \nabla_X u) = g(a, X)$,
where g(u, u) = const. was used in the last step.
^22 Here we adopt the standard notation from differential geometry, where
$\partial_\mu := \partial/\partial x^\mu$ denote the vector fields naturally
defined by the coordinates $\{x^\mu\}_{\mu = 0, \dots, n-1}$. Pointwise

The combination

$\lambda := c\tau/x_0$ (78)

is just the flow parameter for K (76), sometimes referred to as 'Killing time'
(though it is dimensionless). From (77) we can solve for $\lambda$ and $\tau$
as functions of ct and x:

$\lambda = f(ct, x) := \tanh^{-1}(ct/x)$ , (79a)
$\tau = \hat{f}(ct, x) := \underbrace{\sqrt{(x/c)^2 - t^2}}_{x_0/c}\, \tanh^{-1}(ct/x)$ , (79b)

from which we infer that the hypersurfaces of constant $\lambda$ are
hyperplanes which all intersect at the origin. Moreover, we also have
$df = K^\flat/K^2$ (d is just the ordinary exterior differential) so that the
hyperplanes of constant $\lambda$ intersect all orbits of u (and K)
orthogonally. Hence the hyperplanes of constant $\lambda$ qualify as the
equivalence classes of mutually Einstein-simultaneous events in the region
x > |ct| for a family of observers moving along the Killing orbits. This does
not hold for the hypersurfaces of constant $\tau$, which are curved.
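
Relations (77) to (79) are easy to check numerically. The following sketch evaluates the worldline of one rod element and recovers the Killing time and the proper time from the event coordinates. The numerical values of x_0 and tau, as well as the function names, are hypothetical sample choices and not part of the text.

```python
import math

C = 299_792_458.0          # speed of light in m/s


def worldline(x0, tau):
    """Event (ct, x) of the rod element x0 at proper time tau, eq. (77)."""
    phi = C * tau / x0
    return x0 * math.sinh(phi), x0 * math.cosh(phi)


def killing_time(ct, x):
    """Flow parameter lambda of K as a function of the event, eq. (79a)."""
    return math.atanh(ct / x)


def proper_time(ct, x):
    """Proper time tau as a function of the event, eq. (79b)."""
    t = ct / C
    return math.sqrt((x / C) ** 2 - t ** 2) * math.atanh(ct / x)


x0, tau = 1.0e16, 1.0e7    # hypothetical values: roughly a light-year, 1e7 s
ct, x = worldline(x0, tau)

on_hyperbola = math.isclose(x * x - ct * ct, x0 * x0, rel_tol=1e-9)
lambda_ok = math.isclose(killing_time(ct, x), C * tau / x0, rel_tol=1e-9)
tau_ok = math.isclose(proper_time(ct, x), tau, rel_tol=1e-9)
```

The three flags confirm, for this sample point, that the event lies on the hyperbola x^2 - c^2 t^2 = x_0^2 and that (79a) and (79b) invert (77).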

The modulus of the spacetime-acceleration (which is the same as the
modulus of the spatial acceleration measured in the local rest frame) of the
material part of the rod labelled by $x_0$ is

$\|a\|_g = c^2/x_0$ . (80)

As an aside we generally infer from this that, given a timelike curve of local
acceleration (modulus) $\alpha$, infinitesimally nearby orthogonal hyperplanes
intersect at a spatial distance $c^2/\alpha$. This remark will become relevant
in the discussion of part 2 of the Noether-Herglotz theorem given below.

In order to accelerate the rod to the uniform velocity v without deforming
it, its material point labelled by $x_0$ has to accelerate for the eigentime
(this follows from (77))

$\tau = \frac{x_0}{c} \tanh^{-1}(v/c)$ , (81)

which depends on $x_0$. In contrast, the Killing time is the same for all
material points and just given by the final rapidity. In particular, judged
from the local observers moving with the rod, a rigid acceleration requires
accelerating the rod's trailing end harder but shorter than pulling its
leading end.
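
The statement 'harder but shorter' can be quantified with (80) and (81). In the sketch below the rod positions and the target velocity are hypothetical sample values. The product of proper acceleration and required proper time comes out the same for both ends, namely c times the final rapidity.

```python
import math

C = 299_792_458.0          # speed of light in m/s


def proper_acceleration(x0):
    """Modulus of the acceleration of the rod element x0, eq. (80)."""
    return C ** 2 / x0


def proper_time_to_reach(v, x0):
    """Eigentime the element x0 must accelerate to reach velocity v, eq. (81)."""
    return (x0 / C) * math.atanh(v / C)


v = 0.5 * C                          # target velocity (sample value)
trailing, leading = 1.0e16, 2.0e16   # x0 of the two rod ends in m (sample values)

a_trail, a_lead = proper_acceleration(trailing), proper_acceleration(leading)
t_trail, t_lead = proper_time_to_reach(v, trailing), proper_time_to_reach(v, leading)
```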

In terms of the coordinates $(\lambda, x_0)$, which are co-moving with the
flow of K, and $(\tau, x_0)$, which are co-moving with the flow of u, we just
have $K = \partial/\partial\lambda$ and $u = \partial/\partial\tau$
respectively. The spacetime metric g and the projected metric h in terms of
these coordinates are:

$h = dx_0^2$ , (82a)
$g = x_0^2\, d\lambda^2 - dx_0^2 = c^2 \bigl(d\tau - (\tau/x_0)\, dx_0\bigr)^2 - dx_0^2$ . (82b)

Note the simple form g takes in terms of $x_0$ and $\lambda$, which are also
called the 'Rindler coordinates' for the region |x| > |ct| of Minkowski space.
They are the analogs in Lorentzian geometry to polar coordinates (radius
$x_0$, angle $\lambda$) in Euclidean geometry.
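
For readers who wish to verify (82b), the following short computation may help. It is a sketch that uses only the flow (77), the definition (78) and the metric $ds^2 = c^2 dt^2 - dx^2$; no further input is assumed.

```latex
% From (77) with \lambda = c\tau/x_0:
%   ct = x_0 \sinh\lambda , \qquad x = x_0 \cosh\lambda .
\begin{align*}
d(ct) &= \sinh\lambda \, dx_0 + x_0 \cosh\lambda \, d\lambda , \\
dx    &= \cosh\lambda \, dx_0 + x_0 \sinh\lambda \, d\lambda , \\
g = d(ct)^2 - dx^2 &= x_0^2 \, d\lambda^2 - dx_0^2 , \\
x_0 \, d\lambda = c \, d\tau - (c\tau/x_0) \, dx_0
  \;&\Longrightarrow\;
  g = c^2 \bigl( d\tau - (\tau/x_0) \, dx_0 \bigr)^2 - dx_0^2 .
\end{align*}
```

The mixed terms in $dx_0\, d\lambda$ cancel in the third line, which is why g is diagonal in the coordinates $(\lambda, x_0)$.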

Let us now return to the general case. We decompose the derivative of
the velocity one-form $u^\flat := g^\downarrow(u)$ as follows:

$\nabla u^\flat = \theta + \omega + c^{-2}\, u^\flat \otimes a^\flat$ , (83)

where $\theta$ and $\omega$ are the projected symmetrised and antisymmetrised
derivatives respectively^23:

$2\theta = \Pi_h(\nabla \vee u^\flat) = \nabla \vee u^\flat - c^{-2}\, u^\flat \vee a^\flat$ , (84a)
$2\omega = \Pi_h(\nabla \wedge u^\flat) = \nabla \wedge u^\flat - c^{-2}\, u^\flat \wedge a^\flat$ . (84b)

The symmetric part, $\theta$, is usually further decomposed into its traceless
and pure trace part, called the shear and expansion of u respectively. The
antisymmetric part $\omega$ is called the vorticity of u.

Now recall that the Lie derivative of g is just twice the symmetrised
derivative, which in our notation reads:

$L_u g = \nabla \vee u^\flat$ . (85)

This implies in view of (75) and (84a)

Proposition 17. Let u be a normalised timelike vector field. The motion
described by its flow is rigid iff u is of vanishing shear and expansion, i.e.
iff $\theta = 0$.



^23 We denote the symmetrised and antisymmetrised tensor-product (not
including the factor 1/n!) by $\vee$ and $\wedge$ respectively and the
symmetrised and antisymmetrised (covariant-)derivative by $\nabla\vee$ and
$\nabla\wedge$. For example,
$(u^\flat \wedge v^\flat)_{ab} = u_a v_b - u_b v_a$ and
$(\nabla \vee u^\flat)_{ab} = \nabla_a u_b + \nabla_b u_a$. Note that
$(\nabla \wedge u^\flat)$ is the same as the ordinary exterior differential
$du^\flat$. Everything we say in the sequel applies to curved spacetimes
if $\nabla$ is read as covariant derivative with respect to the Levi-Civita
connection.

Vector fields generating rigid motions are now classified according to
whether or not they have a vanishing vorticity $\omega$: if $\omega = 0$ the
flow is called irrotational, otherwise rotational. The following theorem is
due to Herglotz
^ and Noether I40j: 

Theorem 18 (Noether & Herglotz, part 1). A rotational rigid motion in
Minkowski space must be a Killing motion.

An example of such a rotational motion is given by the Killing field

$$K = \partial_t + \kappa\, \partial_\varphi \qquad (86)$$

inside the region

$$\Omega = \{(t, z, \rho, \varphi) \mid \kappa\rho < c\}, \qquad (87)$$

where $K$ is timelike. This motion corresponds to a rigid rotation with
constant angular velocity $\kappa$ which, without loss of generality, we take
to be positive. Using the co-moving angular coordinate
$\psi := \varphi - \kappa t$, the split (71) is now furnished by

$$u^\flat = c\, \sqrt{1 - (\kappa\rho/c)^2}\, \Bigl\{ c\, dt - \frac{\kappa\rho/c}{1 - (\kappa\rho/c)^2}\, \rho\, d\psi \Bigr\}, \qquad (88a)$$
$$h = dz^2 + d\rho^2 + \frac{\rho^2\, d\psi^2}{1 - (\kappa\rho/c)^2}. \qquad (88b)$$

The metric $h$ is curved (cf. Lemma 19). But the rigidity condition (72)
means that $h$, and hence its curvature, cannot change along the motion.
Therefore, even though we can keep a body in uniform rigid rotational motion,
we cannot put it into this state from rest by purely rigid motions, since
this would imply a transition from a flat to a curved geometry of the body.
This was first pointed out by Ehrenfest [19]. Below we will give a concise
analytical expression of this fact (cf. equation (92)). All this is in
contrast to the translational motion, as we will also see below.
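
The expressions (88) can be checked against the Minkowski metric written in
the co-moving angle, $c^2 dt^2 - \rho^2 (d\psi + \kappa\, dt)^2 - dz^2 -
d\rho^2$. The following sketch assumes that the split (71), which is not
reproduced in this section, has the form $g = c^{-2} u^\flat \otimes u^\flat -
h$. It compares the $(t, \psi)$ components and evaluates the circumference of
a co-moving circle as measured with $h$. The function names are hypothetical.

```python
import math

def split_components(kappa, rho, c=1.0):
    """Return (u_t, u_psi, h_psipsi) read off from (88a) and (88b)."""
    beta = kappa * rho / c
    if not 0.0 <= beta < 1.0:
        raise ValueError("K is timelike only for kappa*rho < c")
    root = math.sqrt(1.0 - beta**2)
    u_t = c * root * c
    u_psi = -c * root * beta / (1.0 - beta**2) * rho
    h_psipsi = rho**2 / (1.0 - beta**2)
    return u_t, u_psi, h_psipsi

def metric_from_split(kappa, rho, c=1.0):
    """(g_tt, g_tpsi, g_psipsi) rebuilt as u (x) u / c^2 - h (assumed split)."""
    u_t, u_psi, h = split_components(kappa, rho, c)
    return (u_t * u_t / c**2, u_t * u_psi / c**2, u_psi * u_psi / c**2 - h)

def metric_direct(kappa, rho, c=1.0):
    """The same components from c^2 dt^2 - rho^2 (dpsi + kappa dt)^2."""
    return (c**2 - (kappa * rho)**2, -kappa * rho**2, -rho**2)

def comoving_circumference(kappa, rho, c=1.0):
    """Length of the circle rho = const (z = const) measured with h."""
    return 2.0 * math.pi * math.sqrt(split_components(kappa, rho, c)[2])

rebuilt = metric_from_split(kappa=0.5, rho=1.2)
direct = metric_direct(kappa=0.5, rho=1.2)
length = comoving_circumference(kappa=0.5, rho=1.2)
```

The circumference exceeds $2\pi\rho$ by the factor $(1 - (\kappa\rho/c)^2)^{-1/2}$,
which is one way to see that $h$ is not the flat metric on the disc.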

The proof of Theorem 18 relies on arguments from differential geometry
proper and is somewhat tricky. Here we present the essential steps, basically 
following [1^ and [IH] in a slightly modernised notation. Some straightfor- 
ward calculational details will be skipped. The argument itself is best broken 
down into several lemmas. 

In (86) and (87) we use standard cylindrical coordinates $(z, \rho, \varphi)$,
in terms of which $ds^2 = c^2 dt^2 - dz^2 - d\rho^2 - \rho^2 d\varphi^2$.

At the heart of the proof lies the following general construction: Let
$M$ be the spacetime manifold with metric $g$ and $\Omega \subset M$ the open
region in which the normalised vector field $u$ is defined. We take $\Omega$
to be simply connected. The orbits of $u$ foliate $\Omega$ and hence define an
equivalence relation on $\Omega$ given by $p \sim q$ iff $p$ and $q$ lie on
the same orbit. The quotient space $\hat\Omega := \Omega/\!\sim$ is itself a
manifold. Tensor fields on $\hat\Omega$ can be represented by (i.e. are in
bijective correspondence to) tensor fields $T$ on $\Omega$ which obey the two
conditions:

$$\Pi_h T = T, \qquad (89a)$$
$$L_u T = 0. \qquad (89b)$$

Tensor fields satisfying (89a) are called horizontal, those satisfying both
conditions (89) are called projectable. The $(n-1)$-dimensional metric tensor
$h$, defined in (71), is an example of a projectable tensor if $u$ generates a
rigid motion, as assumed here. It turns $(\hat\Omega, h)$ into an
$(n-1)$-dimensional Riemannian manifold. The covariant derivative
$\hat\nabla$ with respect to the Levi-Civita connection of $h$ is given by the
following operation on projectable tensor fields:

$$\hat\nabla := \Pi_h \circ \nabla, \qquad (90)$$

i.e. by first taking the covariant derivative $\nabla$ (Levi-Civita connection
in $(M, g)$) in spacetime and then projecting the result horizontally. This
results again in a projectable tensor, as a straightforward calculation shows.

The horizontal projection of the spacetime curvature tensor can now be
related to the curvature tensor of $\hat\Omega$ (which is a projectable tensor
field). Without proof we state

Lemma 19. Let $u$ generate a rigid motion in spacetime. Then the horizontal
projection of the totally covariant (i.e. all indices down) curvature tensor
$R$ of $(\Omega, g)$ is related to the totally covariant curvature tensor
$\hat R$ of $(\hat\Omega, h)$ by the following equation:

$$\Pi_h R = -\hat R - 3\, (\mathrm{id} - \Pi_\wedge)\, \omega \otimes \omega, \qquad (91)$$

where $\Pi_\wedge$ is the total antisymmetriser, which here projects tensors
of rank four onto their totally antisymmetric part.

In (91), $\hat R$ appears with a minus sign on the right hand side because the
first index on the hatted curvature tensor is lowered with $h$ rather than
$g$. This induces a minus sign due to (71), i.e. as a result of our
'mostly-minus'-convention for the signature of the spacetime metric.

Formula (91) is true in any spacetime dimension $n$. Note that the projector
$(\mathrm{id} - \Pi_\wedge)$ guarantees consistency with the first Bianchi
identities for $R$ and $\hat R$, which state that the total antisymmetrisation
in their last three slots vanishes identically. This is consistent with (91)
since for tensors of rank four with the symmetries of $\omega \otimes \omega$
the total antisymmetrisation on three slots is identical to $\Pi_\wedge$, the
antisymmetrisation on all four slots. The claim now simply follows from
$\Pi_\wedge \circ (\mathrm{id} - \Pi_\wedge) = \Pi_\wedge - \Pi_\wedge = 0$.

We now restrict to spacetime dimensions of four or less, i.e. $n \le 4$.
In this case $\Pi_\wedge \circ \Pi_h = 0$ since $\Pi_h$ makes the tensor
effectively live over $n - 1$ dimensions, and any totally antisymmetric
four-tensor in three or less dimensions must vanish. Applied to (91) this
means that $\Pi_\wedge(\omega \otimes \omega) = 0$, for horizontality of
$\omega$ implies $\omega \otimes \omega = \Pi_h(\omega \otimes \omega)$. Hence
the right hand side of (91) just contains the pure tensor product
$-3\, \omega \otimes \omega$.

Now, in our case $R = 0$ since $(M, g)$ is flat Minkowski space. This has two
interesting consequences: First, $(\hat\Omega, h)$ is curved iff the motion is
rotational, as exemplified above. Second, since $\hat R$ is projectable, its
Lie derivative with respect to $u$ vanishes. Hence (91) implies
$L_u\omega \otimes \omega + \omega \otimes L_u\omega = 0$, which is equivalent
to

$$L_u \omega = 0. \qquad (92)$$

This says that the vorticity cannot change along a rigid motion in flat space.
It is the precise expression for the remark above that you cannot rigidly set
a disk into rotation. Note that it also provides the justification for the
global classification of rigid motions into rotational and irrotational ones.

A sharp and useful criterion for whether a rigid motion is Killing or not
is given by the following

Lemma 20. Let $u$ be a normalised timelike vector field on a region
$\Omega \subset M$. The motion generated by $u$ is Killing iff it is rigid and
$a^\flat$ is exact on $\Omega$.

Proof. That the motion generated by $u$ be Killing is equivalent to the
existence of a positive function $f : \Omega \to \mathbb{R}$ such that
$L_{fu}\, g = 0$, i.e. $\nabla \vee (f u^\flat) = 0$. In view of (84a) this is
equivalent to

$$2\theta + (d \ln f + c^{-2} a^\flat) \vee u^\flat = 0, \qquad (93)$$

which, in turn, is equivalent to $\theta = 0$ and $a^\flat = -c^2\, d \ln f$.
This is true since $\theta$ is horizontal, $\Pi_h \theta = \theta$, whereas
the second term in (93) vanishes upon applying $\Pi_h$. The result now follows
from reading this equivalence both ways: 1) The Killing condition for
$K := fu$ implies rigidity for $u$ and exactness of $a^\flat$. 2) Rigidity of
$u$ and $a^\flat = -d\Phi$ imply that $K := fu$ is Killing, where
$f := \exp(\Phi/c^2)$. □

We now return to the condition (92) and express $L_u\omega$ in terms of
$du^\flat$. For this we recall that $L_u u^\flat = a^\flat$ (cf. footnote 21)
and that Lie derivatives on forms commute with exterior derivatives. Hence we
have

$$2\, L_u \omega = L_u(\Pi_h\, du^\flat) = \Pi_h\, da^\flat = da^\flat - c^{-2}\, u^\flat \wedge L_u a^\flat. \qquad (94)$$

Here we used the fact that the additional terms that result from the Lie
derivative of the projection tensor $\Pi_h$ vanish, as a short calculation
shows, and also that on forms the projection tensor $\Pi_h$ can be written as
$\Pi_h = \mathrm{id} - c^{-2}\, u^\flat \wedge i_u$, where $i_u$ denotes the
map of insertion of $u$ in the first slot.

In place of (92), in more than four spacetime dimensions one only gets
$(\mathrm{id} - \Pi_\wedge)(L_u\omega \otimes \omega + \omega \otimes L_u\omega) = 0$.

That Lie derivatives on forms commute with exterior derivatives is most easily
seen by recalling that on forms the Lie derivative can be written as
$L_u = d \circ i_u + i_u \circ d$, where $i_u$ is the map of inserting $u$ in
the first slot.

Now we prove

Lemma 21. Let $u$ generate a rigid motion in flat space such that
$\omega \neq 0$, then

$$L_u a^\flat = 0. \qquad (95)$$

Proof. Equation (92) says that $\omega$ is projectable (it is horizontal by
definition). Hence $\hat\nabla\omega$ is projectable, which implies

$$L_u \hat\nabla \omega = 0. \qquad (96)$$

Using (83) with $\theta = 0$ one has

$$\hat\nabla\omega = \Pi_h \nabla\omega = \Pi_h \nabla\nabla u^\flat - c^{-2}\, \Pi_h(\nabla u^\flat \otimes a^\flat). \qquad (97)$$

Antisymmetrisation in the first two tensor slots makes the first term on the
right vanish due to the flatness of $\nabla$. The antisymmetrised right hand
side is hence equal to $-c^{-2}\, \omega \otimes a^\flat$. Taking the Lie
derivative of both sides makes the left hand side vanish due to (96), so that

$$L_u(\omega \otimes a^\flat) = \omega \otimes L_u a^\flat = 0, \qquad (98)$$

where we also used (92). So we see that $L_u a^\flat = 0$ if $\omega \neq 0$. □

The last three lemmas now constitute a proof for Theorem 18. Indeed,
using (95) in (94) together with (92) shows $da^\flat = 0$, so that $a^\flat$
is closed and hence exact on the simply connected region $\Omega$. According
to Lemma 20 this implies that the motion is Killing.

Next we turn to the second part of the theorem of Noether and Herglotz,
which reads as follows:

Theorem 22 (Noether & Herglotz, part 2). All irrotational rigid motions
in Minkowski space are given by the following construction: take a twice
continuously differentiable curve $\tau \mapsto z(\tau)$ in Minkowski space,
where w.l.o.g. $\tau$ is the eigentime (proper time), so that
$\dot z^2 = c^2$. Let $H_\tau := z(\tau) + (\dot z(\tau))^\perp$ be the
hyperplane through $z(\tau)$ intersecting the curve $z$ perpendicularly. Let
$\Omega$ be a tubular neighbourhood of $z$ in which no two hyperplanes
$H_\tau$, $H_{\tau'}$ intersect for any pair $z(\tau)$, $z(\tau')$ of points
on the curve. In $\Omega$ define $u$ as the unique (once differentiable)
normalised timelike vector field perpendicular to all $H_\tau \cap \Omega$.
The flow of $u$ is the sought-for rigid motion.

Proof. We first show that the flow so defined is indeed rigid, even though
this is more or less obvious from its very definition, since we just defined
it by 'rigidly' moving a hyperplane through spacetime. In any case,
analytically we have

$$H_\tau = \{x \in M \mid f(\tau, x) := \dot z(\tau) \cdot (x - z(\tau)) = 0\}. \qquad (99)$$

We will see below that (95) is generally not true if $\omega = 0$; see
equation (107).

In $\Omega$ any $x$ lies on exactly one such hyperplane, $H_\tau$, which means
that there is a function $\sigma : \Omega \to \mathbb{R}$ so that
$\tau = \sigma(x)$ and hence $F(x) := f(\sigma(x), x) \equiv 0$. This implies
$dF = 0$. Using the expression for $f$ from (99) this is equivalent to

$$d\sigma = \frac{\dot z^\flat \circ \sigma}{c^2 - (\ddot z \circ \sigma) \cdot (\mathrm{id} - z \circ \sigma)}, \qquad (100)$$

where 'id' denotes the 'identity vector-field', $x \mapsto x^\mu \partial_\mu$,
in Minkowski space. Note that in $\Omega$ we certainly have
$\partial_\tau f(\tau, x) \neq 0$ and hence $\ddot z \cdot (x - z) \neq c^2$.
In $\Omega$ we now define the normalised timelike vector field

$$u := \dot z \circ \sigma. \qquad (101)$$

Using (100), its derivative is given by

$$\nabla u^\flat = d\sigma \otimes (\ddot z^\flat \circ \sigma) = \frac{(\dot z^\flat \circ \sigma) \otimes (\ddot z^\flat \circ \sigma)}{N c^2}, \qquad (102)$$

where

$$N := 1 - (\ddot z \circ \sigma) \cdot (\mathrm{id} - z \circ \sigma)/c^2. \qquad (103)$$

This immediately shows that $\Pi_h \nabla u^\flat = 0$ (since
$\Pi_h \dot z^\flat = 0$) and therefore that $\theta = \omega = 0$. Hence $u$,
as defined in (101), generates an irrotational rigid motion.
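
The construction of $\sigma$, $u$ and $N$ can be made concrete numerically.
The sketch below is an illustration under assumptions that are not part of the
text: it works in 1+1 dimensions with components $(ct, x)$, uses the worldline
of constant proper acceleration $\alpha$ as test curve, and solves
$f(\tau, x) = 0$ from (99) by Newton's method, using
$\partial_\tau f = \ddot z \cdot (x - z) - c^2$. All function names are
hypothetical.

```python
import math

def minkowski_dot(a, b):
    """Inner product of signature (+, -) on components (ct, x)."""
    return a[0] * b[0] - a[1] * b[1]

def hyperbolic_worldline(alpha, c=1.0):
    """Worldline of constant proper acceleration alpha, parametrised by
    proper time; returns the functions z, zdot, zddot."""
    k, r = alpha / c, c**2 / alpha
    z = lambda t: (r * math.sinh(k * t), r * math.cosh(k * t))
    zdot = lambda t: (c * math.cosh(k * t), c * math.sinh(k * t))
    zddot = lambda t: (alpha * math.sinh(k * t), alpha * math.cosh(k * t))
    return z, zdot, zddot

def sigma(x, z, zdot, zddot, c=1.0, tau0=0.0, steps=50):
    """Solve zdot(tau) . (x - z(tau)) = 0 for tau by Newton iteration."""
    tau = tau0
    for _ in range(steps):
        zt = z(tau)
        diff = (x[0] - zt[0], x[1] - zt[1])
        f = minkowski_dot(zdot(tau), diff)
        df = minkowski_dot(zddot(tau), diff) - c**2  # derivative of f in tau
        tau -= f / df
    return tau

c, alpha = 1.0, 1.0
z, zdot, zddot = hyperbolic_worldline(alpha, c)
x = (0.3, 1.5)                       # event (ct, x) inside the region x > |ct|
tau = sigma(x, z, zdot, zddot, c)
u = zdot(tau)                        # the field (101) evaluated at x
zt = z(tau)
N = 1.0 - minkowski_dot(zddot(tau), (x[0] - zt[0], x[1] - zt[1])) / c**2
```

For this particular worldline the hyperplanes $H_\tau$ are the straight lines
through the origin, so that $\sigma(x) = (c/\alpha)\,\mathrm{artanh}(ct/x)$,
$u$ is proportional to the boost Killing field, and
$N = \alpha x_0/c^2$ with $x_0 = \sqrt{x^2 - c^2 t^2}$. The numerical values
reproduce these closed forms.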

For the converse we need to prove that any irrotational rigid motion
is obtained by such a construction. So suppose $u$ is a normalised timelike
vector field such that $\theta = \omega = 0$. Vanishing $\omega$ means
$\Pi_h(\nabla \wedge u^\flat) = \Pi_h(du^\flat) = 0$. This is equivalent to
$u^\flat \wedge du^\flat = 0$, which according to the Frobenius theorem in
differential geometry is equivalent to the integrability of the distribution
$u^\flat = 0$, i.e. the hypersurface orthogonality of $u$. We wish to show
that the hypersurfaces orthogonal to $u$ are hyperplanes. To this end consider
a spacelike curve $z(s)$, where $s$ is the proper length, running within one
hypersurface perpendicular to $u$. The component of its second $s$-derivative
parallel to the hypersurface is given by (to save notation we now simply write
$u$ and $u^\flat$ instead of $u \circ z$ and $u^\flat \circ z$)

$$\Pi_h \ddot z = \ddot z - c^{-2}\, u\, u^\flat(\ddot z) = \ddot z + c^{-2}\, u\, \theta(\dot z, \dot z) = \ddot z, \qquad (104)$$

where we made a partial differentiation in the second step and then used
$\theta = 0$. Geodesics in the hypersurface are curves whose second derivative
with respect to proper length has vanishing components parallel to the
hypersurface. Now, (104) implies that geodesics in the hypersurface are
geodesics in Minkowski space (the hypersurface is 'totally geodesic'), i.e.
given by straight lines. Hence the hypersurfaces are hyperplanes. □

Note that, by definition of $\sigma$,
$(\dot z \circ \sigma) \cdot (\mathrm{id} - z \circ \sigma) = 0$.

'Distribution' is here used in the differential-geometric sense, where for a
manifold $M$ it denotes an assignment of a linear subspace $V_p$ in the
tangent space $T_pM$ to each point $p$ of $M$. The distribution
$u^\flat = 0$ is defined by
$V_p = \{v \in T_pM \mid u^\flat_p(v) = u_p \cdot v = 0\}$. A distribution is
called (locally) integrable if (in the neighbourhood of each point) there is a
submanifold $M'$ of $M$ whose tangent space at any $p \in M'$ is just $V_p$.
Theorem 22 precisely corresponds to the Newtonian counterpart: The
irrotational motion of a rigid body is determined by the worldline of any
of its points, and any timelike worldline determines such a motion. We can
rigidly put an extended body into any state of translational motion, as long
as the size of the body is limited by $c^2/\alpha$, where $\alpha$ is the
modulus of its acceleration. This also shows that (95) is generally not valid
for irrotational rigid motions. In fact, the acceleration one-form field for
(101) is

$$a^\flat = (\ddot z^\flat \circ \sigma)/N, \qquad (105)$$

from which one easily computes

$$da^\flat = (\dot z^\flat \circ \sigma) \wedge \Bigl\{ (\Pi_h \dddot z^\flat \circ \sigma) + (\ddot z^\flat \circ \sigma)\, \frac{(\Pi_h \dddot z \circ \sigma) \cdot (\mathrm{id} - z \circ \sigma)}{N c^2} \Bigr\}\, \frac{N^{-2}}{c^2}. \qquad (106)$$

From this one sees, for example, that for constant acceleration, defined by
$\Pi_h \dddot z = 0$ (constant acceleration in time as measured in the
instantaneous rest frame), we have $da^\flat = 0$ and hence a Killing motion.
Clearly, this is just the motion (77) for the boost Killing field (78). The
Lie derivative of $a^\flat$ is now easily obtained:

$$L_u a^\flat = i_u\, da^\flat = \Bigl\{ (\Pi_h \dddot z^\flat \circ \sigma) + (\ddot z^\flat \circ \sigma)\, \frac{(\Pi_h \dddot z \circ \sigma) \cdot (\mathrm{id} - z \circ \sigma)}{N c^2} \Bigr\}\, N^{-2}, \qquad (107)$$

showing explicitly that it is not zero except for motions of constant
acceleration, which were just seen to be Killing motions.

In contrast to the irrotational case just discussed, we have seen that
we cannot put a body rigidly into rotational motion. In the old days this
was sometimes expressed by saying that the rigid body in SR has only three
instead of six degrees of freedom. This was clearly thought to be paradoxical
as long as one assumed that the notion of a perfectly rigid body should also
make sense in the framework of SR. However, this hope was soon realized
to be physically untenable [36].



A Appendices 

In this appendix we spell out in detail some of the mathematical notions 
that were used in the main text. 



A.1 Sets and group actions

Given a set $S$, recall that an equivalence relation is a subset
$R \subset S \times S$ such that for all $p, q, r \in S$ the following
conditions hold: 1) $(p,p) \in R$ (called 'reflexivity'), 2) if
$(p,q) \in R$ then $(q,p) \in R$ (called 'symmetry'), and 3) if
$(p,q) \in R$ and $(q,r) \in R$ then $(p,r) \in R$ (called 'transitivity').
Once $R$ is given, one often conveniently writes $p \sim q$ instead of
$(p,q) \in R$. Given $p \in S$, its equivalence class, $[p] \subset S$, is
given by all points $R$-related to $p$, i.e.
$[p] := \{q \in S \mid (p,q) \in R\}$. One easily shows that equivalence
classes are either identical or disjoint. Hence they form a partition of $S$,
that is, a covering by mutually disjoint subsets. Conversely, given a
partition of a set $S$, it defines an equivalence relation by declaring two
points as related iff they are members of the same cover set. Hence there is a
bijective correspondence between partitions of and equivalence relations on a
set $S$. The set of equivalence classes is denoted by $S/R$ or $S/\!\sim$.
There is a natural surjection $S \to S/R$, $p \mapsto [p]$.
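
For finite sets the correspondence between equivalence relations and
partitions is easily made explicit. The following sketch uses hypothetical
helper names and, as an example of our own choosing, congruence modulo 3 on
$\{0, \dots, 5\}$.

```python
def is_equivalence_relation(S, R):
    """Check reflexivity, symmetry and transitivity of R on the set S."""
    reflexive = all((p, p) in R for p in S)
    symmetric = all((q, p) in R for (p, q) in R)
    transitive = all((p, r) in R
                     for (p, q) in R for (q2, r) in R if q == q2)
    return reflexive and symmetric and transitive

def equivalence_classes(S, R):
    """The set S/R of classes [p] = {q in S | (p, q) in R}."""
    return {frozenset(q for q in S if (p, q) in R) for p in S}

def relation_from_partition(blocks):
    """Declare two points related iff they lie in the same block."""
    return {(p, q) for block in blocks for p in block for q in block}

S = set(range(6))
R = {(p, q) for p in S for q in S if (p - q) % 3 == 0}
classes = equivalence_classes(S, R)
```

Going from `R` to `classes` and back with `relation_from_partition` returns
the relation one started with, which is the bijective correspondence stated
above in the finite case.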

If in the definition of equivalence relation we exchange symmetry for
antisymmetry, i.e. $(p,q) \in R$ and $(q,p) \in R$ implies $p = q$, the
relation is called a partial order, usually written as $p \ge q$ for
$(p,q) \in R$. If, instead, reflexivity is dropped and symmetry is replaced by
asymmetry, i.e. $(p,q) \in R$ implies $(q,p) \notin R$, one obtains a relation
called a strict partial order, usually denoted by $p > q$ for $(p,q) \in R$.

A left action of a group $G$ on a set $S$ is a map
$\phi : G \times S \to S$, such that $\phi(e, s) = s$ ($e$ = group identity)
and $\phi(gh, s) = \phi(g, \phi(h, s))$. If instead of the latter equation we
have $\phi(gh, s) = \phi(h, \phi(g, s))$ one speaks of a right action. For
left actions one sometimes conveniently writes $\phi(g, s) =: g \cdot s$, for
right actions $\phi(g, s) =: s \cdot g$. An action is called transitive if for
every pair $(s, s') \in S \times S$ there is a $g \in G$ such that
$\phi(g, s) = s'$, and simply transitive if, in addition, $(s, s')$ determine
$g$ uniquely, that is, $\phi(g, s) = \phi(g', s)$ for some $s$ implies
$g = g'$. The action is called effective if $\phi(g, s) = s$ for all $s$
implies $g = e$ ('every $g \neq e$ moves something') and free if
$\phi(g, s) = s$ for some $s$ implies $g = e$ ('no $g \neq e$ has a fixed
point'). It is obvious that simple transitivity implies freeness and that,
conversely, freeness and transitivity imply simple transitivity. Moreover, for
Abelian groups, effectivity and transitivity suffice to imply simple
transitivity. Indeed, suppose $g \cdot s = g' \cdot s$ holds for some
$s \in S$, then we also have $k \cdot (g \cdot s) = k \cdot (g' \cdot s)$ for
all $k \in G$ and hence $g \cdot (k \cdot s) = g' \cdot (k \cdot s)$ by
commutativity. Since by transitivity every point of $S$ is of the form
$k \cdot s$, this implies that $g \cdot s = g' \cdot s$ holds, in fact, for
all $s$, so that $g = g'$ by effectivity.
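
The four notions just introduced can be tested by brute force for finite
groups. The sketch below uses hypothetical function names and three examples
chosen by us: $\mathbb{Z}_4$ acting on itself by addition, $\mathbb{Z}_4$
acting on $\{0, 1\}$ by addition modulo 2, and the symmetric group $S_3$
acting on $\{0, 1, 2\}$. The last example shows that the implication from
effectivity and transitivity to simple transitivity does need the group to be
Abelian.

```python
from itertools import permutations

def is_transitive(G, S, phi):
    """Every pair (s, t) is connected by some group element."""
    return all(any(phi(g, s) == t for g in G) for s in S for t in S)

def is_simply_transitive(G, S, phi):
    """Every pair (s, t) is connected by exactly one group element."""
    return all(sum(phi(g, s) == t for g in G) == 1 for s in S for t in S)

def is_effective(G, S, phi, e):
    """Only the identity e fixes every point."""
    return all(g == e for g in G if all(phi(g, s) == s for s in S))

def is_free(G, S, phi, e):
    """Only the identity e fixes some point."""
    return all(g == e for g in G for s in S if phi(g, s) == s)

Z4 = range(4)
add_mod4 = lambda g, s: (g + s) % 4   # Z_4 acting on itself
add_mod2 = lambda g, s: (g + s) % 2   # Z_4 acting on {0, 1}
S3 = list(permutations(range(3)))     # permutations stored as tuples
permute = lambda g, s: g[s]           # S_3 acting on {0, 1, 2}
```

For instance, `is_effective(Z4, range(2), add_mod2, 0)` is false because the
element 2 acts trivially, while the action of `S3` is effective and transitive
but neither free nor simply transitive.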

For any $s \in S$ we can consider the stabilizer subgroup

$$\mathrm{Stab}(s) := \{g \in G \mid \phi(g, s) = s\} \subset G. \qquad (108)$$

If $\phi$ is transitive, any two stabilizer subgroups are conjugate:
$\mathrm{Stab}(g \cdot s) = g\, \mathrm{Stab}(s)\, g^{-1}$. By definition, if
$\phi$ is free all stabilizer subgroups are trivial (consist of the identity
element only). In general, the intersection
$G' := \bigcap_{s \in S} \mathrm{Stab}(s) \subset G$ is the normal subgroup of
elements acting trivially on $S$. If $\phi$ is an action of $G$ on $S$, then
there is an effective action $\hat\phi$ of $\hat G := G/G'$ on $S$, defined by
$\hat\phi([g], s) := \phi(g, s)$, where $[g]$ denotes the $G'$-coset of $g$ in
$G$.

The orbit of $s$ in $S$ under the action $\phi$ of $G$ is the subset

$$\mathrm{Orb}(s) := \{\phi(g, s) \mid g \in G\} \subset S. \qquad (109)$$






It is easy to see that group orbits are either disjoint or identical. Hence they 
define a partition of S, that is, an equivalence relation. 
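
Stabilizers, orbits and the conjugation formula can again be checked by
enumeration. In the sketch below permutations of $\{0, 1, 2\}$ are stored as
tuples and act from the left by $g \cdot s = g[s]$. The group $S_3$ and its
two-element subgroup `H` are examples of our own choosing, and the helper names
are hypothetical.

```python
from itertools import permutations

def compose(g, h):
    """Group product in S_n: (g h)(s) = g(h(s))."""
    return tuple(g[h[s]] for s in range(len(h)))

def inverse(g):
    """Inverse permutation."""
    inv = [0] * len(g)
    for s, t in enumerate(g):
        inv[t] = s
    return tuple(inv)

def stabilizer(G, s):
    """Stab(s) as in (108)."""
    return {g for g in G if g[s] == s}

def orbit(G, s):
    """Orb(s) as in (109)."""
    return frozenset(g[s] for g in G)

S = range(3)
S3 = set(permutations(S))
e = (0, 1, 2)
H = {e, (1, 0, 2)}   # subgroup generated by one transposition

orbits_of_H = {orbit(H, s) for s in S}
conjugate_ok = all(
    stabilizer(S3, g[s])
    == {compose(compose(g, h), inverse(g)) for h in stabilizer(S3, s)}
    for g in S3 for s in S
)
kernel = set.intersection(*(stabilizer(S3, s) for s in S))
```

The orbits of `H` are $\{0, 1\}$ and $\{2\}$, which indeed partition the set,
and the intersection of all stabilizers of $S_3$ is trivial, reflecting that
this action is effective.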

A relation $R$ on $S$ is said to be invariant under the self map
$f : S \to S$ if $(p,q) \in R \Leftrightarrow (f(p), f(q)) \in R$. It is said
to be invariant under the action $\phi$ of $G$ on $S$ if
$(p,q) \in R \Leftrightarrow (\phi(g,p), \phi(g,q)) \in R$ for all $g \in G$.
If $R$ is such a $G$-invariant equivalence relation, there is an action
$\phi'$ of $G$ on the set $S/R$ of equivalence classes, defined by
$\phi'(g, [p]) := [\phi(g, p)]$. A general theorem states that non-trivial
invariant equivalence relations exist for transitive group actions iff the
stabilizer subgroups (which in the transitive case are all conjugate) are not
maximal (e.g. Theorem 1.12 in [31]).

A.2 Affine spaces

Definition 7. An $n$-dimensional affine space over the field $F$ (usually
$\mathbb{R}$ or $\mathbb{C}$) is a triple $(S, V, \Phi)$, where $S$ is a
non-empty set, $V$ an $n$-dimensional vector space over $F$, and $\Phi$ an
effective and transitive action $\Phi : V \times S \to S$ of $V$ (considered
as Abelian group with respect to addition of vectors) on $S$.

We remark that an effective and transitive action of an Abelian group
is necessarily simply transitive. Hence, without loss of generality, we could
have required a simply transitive action in Definition 7 straightaway. We
also note that even though the action $\Phi$ only refers to the Abelian group
structure of $V$, it is nevertheless important for the definition of an affine
space that $V$ is, in fact, a vector space (see below). Any ordered pair of
points $(p, q) \in S \times S$ uniquely defines a vector $v$, namely that for
which $p = q + v$. It can be thought of as the difference vector pointing from
$q$ to $p$. We write $v = \Delta(q, p)$, where
$\Delta : S \times S \to V$ is a map which satisfies the conditions

$$\Delta(p, q) + \Delta(q, r) = \Delta(p, r) \quad \text{for all } p, q, r \in S, \qquad (110a)$$
$$\Delta_q : S \ni p \mapsto \Delta(q, p) \in V \quad \text{is a bijection for all } q \in S. \qquad (110b)$$

Conversely, these conditions suffice to characterise an affine space, as
stated in the following proposition, the proof of which is left to the reader:

Proposition 23. Let $S$ be a non-empty set, $V$ an $n$-dimensional vector
space over $F$ and $\Delta : S \times S \to V$ a map satisfying conditions
(110). Then $S$ is an $n$-dimensional affine space over $F$ with action
$\Phi(v, p) := \Delta_p^{-1}(v)$.

One usually writes $\Phi(v, p) =: p + v$, which defines what is meant by
'+' between an element of an affine space and an element of $V$. Note that
addition of two points in affine space is not defined. The property of being
an action now states $p + 0 = p$ and $(p + v) + w = p + (v + w)$, so that in
the latter case we may just write $p + v + w$. Similarly we write
$\Delta(p, q) =: q - p$, defining what is meant by '$-$' between two elements
of affine space. The minus sign also makes sense between an element of affine
space and an element of vector space if one defines $p + (-v) =: p - v$. We
may now write equations like

$$p + (q - r) = q + (p - r), \qquad (111)$$

the formal proof of which is again left to the reader. It implies that 

Considered as Abelian group, any linear subspace $W \subset V$ defines a
subgroup. The orbit of that subgroup in $S$ through $p \in S$ is an affine
subspace, denoted by $W_p$, i.e.

$$W_p = p + W := \{p + w \mid w \in W\}, \qquad (112)$$

which is an affine space over $W$ in its own right of dimension $\dim(W)$.
One-dimensional affine subspaces are called (straight) lines, two-dimensional
ones planes, and those of co-dimension one are called hyperplanes.
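
The rules for '+' and '$-$' can be mirrored in a small data model in which
points and vectors are distinct types, so that adding two points is rejected.
The classes below are a hypothetical sketch that models $S$ and $V$ by
coordinate tuples, which presupposes a choice of origin and basis that an
affine space does not itself carry.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vector:
    comps: tuple

    def __add__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(tuple(a + b for a, b in zip(self.comps, other.comps)))

    def __neg__(self):
        return Vector(tuple(-a for a in self.comps))

@dataclass(frozen=True)
class Point:
    coords: tuple

    def __add__(self, v):
        # point + vector is the action of V on S; point + point is undefined
        if not isinstance(v, Vector):
            return NotImplemented
        return Point(tuple(a + b for a, b in zip(self.coords, v.comps)))

    def __sub__(self, other):
        if isinstance(other, Point):    # difference vector q - p
            return Vector(tuple(a - b for a, b in zip(self.coords, other.coords)))
        if isinstance(other, Vector):   # p - v := p + (-v)
            return self + (-other)
        return NotImplemented

p, q, r = Point((1, 2)), Point((4, 0)), Point((-1, 5))
lhs = p + (q - r)
rhs = q + (p - r)
```

Here `lhs == rhs` is an instance of (111), while evaluating `p + q` raises a
`TypeError`, in keeping with the remark that the sum of two points is not
defined.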

A.3 Affine maps

Affine morphisms, or simply affine maps, are structure preserving maps
between affine spaces. To define them in view of Definition 7 we recall once
more the significance of $V$ being a vector space and not just an Abelian
group. This enters the following definition in an essential way, since there
are considerably more automorphisms of $V$ as Abelian group, i.e. maps
$f : V \to V$ that satisfy $f(v + w) = f(v) + f(w)$ for all $v, w \in V$, than
automorphisms of $V$ as linear space (which, in addition, need to satisfy
$f(av) = af(v)$ for all $v \in V$ and all $a \in F$). In fact, the difference
is precisely that the latter are all continuous automorphisms of $V$
(considered as topological Abelian group), whereas there are plenty
(uncountably many) discontinuous ones, see [21].

Let $F = \mathbb{R}$, then it is easy to see that $f(v + w) = f(v) + f(w)$ for
all $v, w \in V$ implies $f(av) = af(v)$ for all $v \in V$ and all
$a \in \mathbb{Q}$ (rational numbers). For continuous $f$ this implies the
same for all $a \in \mathbb{R}$. All discontinuous $f$ are obtained as
follows: let $\{e_\lambda\}_{\lambda \in I}$ be a (necessarily uncountable)
basis of $\mathbb{R}$ as vector space over $\mathbb{Q}$ ('Hamel basis'),
prescribe any values $f(e_\lambda)$, and extend $f$ linearly to all of
$\mathbb{R}$. Any value-prescription for which
$I \ni \lambda \mapsto f(e_\lambda)/e_\lambda \in \mathbb{R}$ is not constant
gives rise to a non $\mathbb{R}$-linear and discontinuous $f$. Such $f$ are
'wildly' discontinuous in the following sense: for any interval
$U \subset \mathbb{R}$, $f(U) \subset \mathbb{R}$ is dense.

Definition 8. Let $(S, V, \Phi)$ and $(S', V', \Phi')$ be two affine spaces.
An affine morphism or affine map is a pair of maps $F : S \to S'$ and
$f : V \to V'$, where $f$ is linear, such that

$$F \circ \Phi = \Phi' \circ f \times F. \qquad (113)$$

In the convenient way of writing introduced above, this is equivalent to

$$F(q + v) = F(q) + f(v) \qquad (114)$$

for all $q \in S$ and all $v \in V$. (Note that the + sign on the left refers
to the action $\Phi$ of $V$ on $S$, whereas that on the right refers to the
action $\Phi'$ of $V'$ on $S'$.) This shows that an affine map $F$ is
determined once the linear map $f$ between the underlying vector spaces is
given and the image $q'$ of an arbitrary point $q$ is specified. Equation
(114) can be rephrased as follows:

Corollary 24. Let $(S, V, \Phi)$ and $(S', V', \Phi')$ be two affine spaces.
A map $F : S \to S'$ is affine iff each of its restrictions to lines in $S$ is
affine.

Setting $p := q + v$, equation (114) is equivalent to

$$F(p) - F(q) = f(p - q) \qquad (115)$$

for all $p, q \in S$. In view of the alternative definition of affine spaces
suggested by Proposition 23 this shows that we could have defined affine maps
alternatively to (113) by ($\Delta' : S' \times S' \to V'$ is the difference
map in $S'$)

$$\Delta' \circ F \times F = f \circ \Delta. \qquad (116)$$

Affine bijections of an affine space $(S, V, \Phi)$ onto itself form a group,
the affine group, denoted by $\mathrm{GA}(S, V, \Phi)$. Group multiplication
is just given by composition of maps, that is
$(F_1, f_1)(F_2, f_2) := (F_1 \circ F_2, f_1 \circ f_2)$. It is immediate that
the composed maps again satisfy (113).

For any $v \in V$, the map $F = \Phi_v : p \mapsto p + v$ is an affine
bijection for which $f = \mathrm{id}_V$. Note that in this case (113) simply
turns into the requirement $\Phi_v \circ \Phi_w = \Phi_w \circ \Phi_v$ for all
$w \in V$, which is clearly satisfied due to $V$ being a commutative group.
Hence there is a natural embedding $T : V \to \mathrm{GA}(S, V, \Phi)$, the
image $T(V)$ of which is called the subgroup of translations. The map
$F \mapsto F_* := f$ defines a group homomorphism
$\mathrm{GA}(S, V, \Phi) \to \mathrm{GL}(V)$, since
$(F_1 \circ F_2)_* = f_1 \circ f_2$. We have just seen that the translations
are in the kernel of this map. In fact, the kernel is equal to the subgroup
$T(V)$ of translations, as one easily infers from (115) with
$f = \mathrm{id}_V$, which is equivalent to $F(p) - p = F(q) - q$ for all
$p, q \in S$. Hence there exists a $v \in V$ such that for all $p \in S$ we
have $F(p) = p + v$.

The quotient group $\mathrm{GA}(S, V, \Phi)/T(V)$ is then clearly isomorphic
to $\mathrm{GL}(V)$. There are also embeddings
$\mathrm{GL}(V) \to \mathrm{GA}(S, V, \Phi)$, but no canonical one: each one
depends on the choice of a reference point $o \in S$, and is given by
$\mathrm{GL}(V) \ni f \mapsto F \in \mathrm{GA}(S, V, \Phi)$, where
$F(p) := o + f(p - o)$ for all $p \in S$. This shows that
$\mathrm{GA}(S, V, \Phi)$ is isomorphic to the semi-direct product
$V \rtimes \mathrm{GL}(V)$, though the isomorphism depends on the choice of
$o \in S$. The action of $(a, A) \in V \rtimes \mathrm{GL}(V)$ on $p \in S$ is
then defined by

$$((a, A), p) \mapsto o + a + A(p - o), \qquad (117)$$

which is easily checked to define indeed an ($o$ dependent) action of
$V \rtimes \mathrm{GL}(V)$ on $S$.
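
The check that (117) defines an action can be carried out on an example. The
sketch below models $S$ and $V$ by integer coordinate pairs and uses the
semi-direct product law $(a_1, A_1)(a_2, A_2) = (a_1 + A_1 a_2, A_1 A_2)$,
which is the one stated in (118). The helper names and the sample data are
hypothetical.

```python
def vec_add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def vec_sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def mat_vec(A, v):
    return tuple(sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A)))

def mat_mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(len(B)))
                       for j in range(len(B[0]))) for i in range(len(A)))

def act(g, p, o):
    """The map (117): ((a, A), p) -> o + a + A(p - o)."""
    a, A = g
    return vec_add(vec_add(o, a), mat_vec(A, vec_sub(p, o)))

def multiply(g1, g2):
    """Semi-direct product law (a1, A1)(a2, A2) = (a1 + A1 a2, A1 A2)."""
    (a1, A1), (a2, A2) = g1, g2
    return (vec_add(a1, mat_vec(A1, a2)), mat_mul(A1, A2))

o = (1, 2)                           # chosen reference point
p = (3, -1)
g1 = ((1, 0), ((0, -1), (1, 0)))     # translation part, linear part
g2 = ((2, 3), ((1, 1), (0, 1)))
identity = ((0, 0), ((1, 0), (0, 1)))

stepwise = act(g1, act(g2, p, o), o)
at_once = act(multiply(g1, g2), p, o)
```

Both `stepwise` and `at_once` evaluate to the same point, and `identity` acts
trivially, which are the two defining properties of a left action.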
A.4 Affine frames, active and passive transformations

Before giving the definition of an affine frame, we recall that of a linear
frame:

Definition 9. A linear frame of the $n$-dimensional vector space $V$ over $F$
is a basis $f = \{e_a\}_{a=1 \cdots n}$ of $V$, regarded as a linear
isomorphism $f : F^n \to V$, given by $f(v^1, \cdots, v^n) := v^a e_a$. The
set of linear frames of $V$ is denoted by $\mathcal{F}_V$.

Since $F$ and hence $F^n$ carries a natural topology, there is also a natural
topology of $V$, namely that which makes each frame-map $f : F^n \to V$ a
homeomorphism.

There is a natural right action of $\mathrm{GL}(F^n)$ on $\mathcal{F}_V$,
given by $(A, f) \mapsto f \circ A$. It is immediate that this action is
simply transitive. It is sometimes called the passive interpretation of the
transformation group $\mathrm{GL}(F^n)$, presumably because it moves the
frames (associated to the observer) and not the points of $V$.

On the other hand, any frame $f$ induces an isomorphism of algebras
$\mathrm{End}(F^n) \to \mathrm{End}(V)$, given by
$A \mapsto A^f := f \circ A \circ f^{-1}$. If $A = \{A^b_a\}$, then
$A^f(e_a) = A^b_a e_b$, where $f = \{e_a\}_{a=1 \cdots n}$. Restricted to
$\mathrm{GL}(F^n) \subset \mathrm{End}(F^n)$, this induces a group isomorphism
$\mathrm{GL}(F^n) \to \mathrm{GL}(V)$ and hence an $f$-dependent action of
$\mathrm{GL}(F^n)$ on $V$ by linear transformations, defined by
$(A, v) \mapsto A^f v = f(Ax)$, where $f(x) = v$. This is sometimes called the
active interpretation of the transformation group $\mathrm{GL}(F^n)$,
presumably because it really moves the points of $V$.

We now turn to affine spaces:

Definition 10. An affine frame of the $n$-dimensional affine space
$(S, V, \Phi)$ over $F$ is a tuple $F := (o, f)$, where $o$ is a base point in
$S$ and $f : F^n \to V$ is a linear frame of $V$. $F$ is regarded as a map
$F^n \to S$, given by $F(x) := o + f(x)$. We denote the set of affine frames
by $\mathcal{F}_{(S,V,\Phi)}$.

Now there is a natural topology of $S$, namely that which makes each
frame-map $F : F^n \to S$ a homeomorphism.

If we regard $F^n$ as an affine space $\mathrm{Aff}(F^n)$, it comes with a
distinguished base point $o$, the zero vector. The group
$\mathrm{GA}(\mathrm{Aff}(F^n))$ is therefore naturally isomorphic to
$F^n \rtimes \mathrm{GL}(F^n)$. The latter naturally acts on $F^n$ in the
standard way, $\Phi : ((a, A), x) \mapsto \Phi((a, A), x) := A(x) + a$, where
group multiplication is given by

$$(a_1, A_1)(a_2, A_2) = (a_1 + A_1 a_2,\ A_1 A_2). \qquad (118)$$

The group $F^n \rtimes \mathrm{GL}(F^n)$ has a natural right action on
$\mathcal{F}_{(S,V,\Phi)}$, where $(g, F) \mapsto F \cdot g := F \circ g$.
Explicitly, for $g = (a, A)$ and $F = (o, f)$, this action reads:

$$F \cdot g = (o, f) \cdot (a, A) = (o + f(a),\ f \circ A). \qquad (119)$$
It is easy to verify directly that this is an action which, moreover, is again
simply transitive. It is referred to as the passive interpretation of the
affine group $F^n \rtimes \mathrm{GL}(F^n)$.
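
The direct verification can be illustrated on an example. In the sketch below
an affine frame is a pair `(o, f)` with `f` stored as a matrix, so that
$F(x) = o + f x$, and a group element is a pair `(a, A)`. Coordinates, helper
names and sample data are hypothetical choices made for illustration.

```python
def vec_add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def mat_vec(A, v):
    return tuple(sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A)))

def mat_mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(len(B)))
                       for j in range(len(B[0]))) for i in range(len(A)))

def apply_frame(frame, x):
    """The frame map F(x) = o + f(x)."""
    o, f = frame
    return vec_add(o, mat_vec(f, x))

def apply_group_element(g, x):
    """Standard action on F^n: x -> A(x) + a."""
    a, A = g
    return vec_add(mat_vec(A, x), a)

def multiply(g1, g2):
    """Group law (118)."""
    (a1, A1), (a2, A2) = g1, g2
    return (vec_add(a1, mat_vec(A1, a2)), mat_mul(A1, A2))

def act_on_frame(frame, g):
    """Right action (119): (o, f).(a, A) = (o + f(a), f o A)."""
    (o, f), (a, A) = frame, g
    return (vec_add(o, mat_vec(f, a)), mat_mul(f, A))

frame = ((1, 2), ((2, 1), (0, 1)))
g1 = ((1, 0), ((0, -1), (1, 0)))
g2 = ((2, 3), ((1, 1), (0, 1)))
x = (3, -1)

as_composition = apply_frame(frame, apply_group_element(g1, x))
as_new_frame = apply_frame(act_on_frame(frame, g1), x)
```

The equality of `as_composition` and `as_new_frame` expresses
$F \cdot g = F \circ g$, and acting with `g1` and then `g2` gives the same
frame as acting once with `multiply(g1, g2)`, which is the defining property
of a right action.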

Conversely, depending on the choice of an affine frame
$F \in \mathcal{F}_{(S,V,\Phi)}$, there is a group isomorphism
$F^n \rtimes \mathrm{GL}(F^n) \to \mathrm{GA}(S, V, \Phi)$, given by
$(a, A) \mapsto F \circ (a, A) \circ F^{-1}$, and hence an $F$ dependent
action of $F^n \rtimes \mathrm{GL}(F^n)$ by affine maps on $(S, V, \Phi)$. If
$F = (o, f)$ and $F(x) = p$, the action reads

$$((a, A), p) \mapsto F(Ax + a) = A^f(p - o) + o + f(a). \qquad (120)$$

This is called the active interpretation of the affine group
$F^n \rtimes \mathrm{GL}(F^n)$.

An affine frame $(o, f)$ with $f = \{e_a\}_{a=1 \cdots n}$ defines $n + 1$
points $\{p_0, p_1, \cdots, p_n\}$, where $p_0 := o$ and $p_a := o + e_a$ for
$1 \le a \le n$. Conversely, any $n + 1$ points $\{p_0, p_1, \cdots, p_n\}$ in
affine space, for which $e_a := p_a - p_0$ are linearly independent, define an
affine frame. Note that this linear independence does not depend on the choice
of $p_0$ as our base point, as one easily sees from the identity

$$\sum_{a=1}^{m} v^a (p_a - p_0) = \sum_{k \neq a = 0}^{m} v^a (p_a - p_k), \quad \text{where } v^0 := -\sum_{a=1}^{m} v^a, \qquad (121)$$

which holds for any set $\{p_0, p_1, \cdots, p_m\}$ of $m + 1$ points in affine
space. To prove it one just needs (111). Hence we say that these points are
affinely independent iff, e.g., the set of $m$ vectors
$\{e_a := p_a - p_0 \mid 1 \le a \le m\}$ is linearly independent. Therefore,
an affine frame of $n$-dimensional affine space is equivalent to $n + 1$
affinely independent points. Such a set of points is also called an affine
basis.
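
That affine independence does not depend on the base point can be tested
directly. The sketch below models points by integer coordinate tuples and
computes the rank of the difference vectors by exact Gaussian elimination over
the rationals. The function names and the sample point sets are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def affinely_independent(points, base=0):
    """True iff the vectors p_a - p_base (a != base) are linearly independent."""
    diffs = [tuple(a - b for a, b in zip(p, points[base]))
             for i, p in enumerate(points) if i != base]
    return rank(diffs) == len(diffs)

triangle = [(0, 0), (1, 0), (0, 1)]   # an affine basis of the plane
collinear = [(0, 0), (1, 1), (2, 2)]  # three points on one line

triangle_results = [affinely_independent(triangle, b) for b in range(3)]
collinear_results = [affinely_independent(collinear, b) for b in range(3)]
```

For each point set the answer is the same for every choice of `base`, as the
identity (121) guarantees.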

Given an affine basis $\{p_0, p_1, \cdots, p_n\} \subset S$ and a point
$q \in S$, there is a unique $n$-tuple $(v^1, \cdots, v^n) \in F^n$ such that

$$q = p_0 + \sum_{a=1}^{n} v^a (p_a - p_0). \qquad (122a)$$

Writing $v^k(p_k - p_0) = (p_k - p_0) + (1 - v^k)(p_0 - p_k)$ for some chosen
$k \in \{1, \cdots, n\}$ and
$v^a(p_a - p_0) = v^a(p_a - p_k) - v^a(p_0 - p_k)$ for all $a \neq k$, this
can be rewritten, using (111), as

$$q = p_k + \sum_{k \neq a = 0}^{n} v^a (p_a - p_k), \quad \text{where } v^0 := 1 - \sum_{a=1}^{n} v^a. \qquad (122b)$$

This motivates writing the sums on the right hand sides of (122) in a
perfectly symmetric way without preference of any point $p_k$:

$$q = \sum_{a=0}^{n} v^a p_a, \quad \text{where } \sum_{a=0}^{n} v^a = 1, \qquad (123)$$

where the right hand side is defined by any of the expressions (122). This
defines certain linear combinations of affine points, namely those whose
coefficients add up to one. Accordingly, the affine span of points
$\{p_1, \cdots, p_m\}$ in affine space is defined by